Intron sequence analysis method for detection of adjacent and remote locus alleles as haplotypes.



The present invention provides a method for detection of at least one allele of a genetic locus and can be
used to provide direct determination of the haplotype. The method comprises amplifying genomic DNA with a
primer pair that spans an intron sequence and defines a DNA sequence in genetic linkage with an allele to be
detected. The primer-defined DNA sequence contains a sufficient number of intron sequence nucleotides to
characterize the allele. Genomic DNA is amplified to produce an amplified DNA sequence characteristic of the
allele. The amplified DNA sequence is analyzed to detect the presence of a genetic variation in the amplified
DNA sequence such as a change in the length of the sequence, gain or loss of a restriction site or substitution of
a nucleotide. The variation is characteristic of the allele to be detected and can be used to detect remote alleles.
Kits comprising one or more of the reagents used in the method are also described. 
INTRON SEQUENCE ANALYSIS METHOD FOR DETECTION OF ADJACENT AND REMOTE LOCUS ALLELES AS HAPLOTYPES

The present invention relates to a method for detection of alleles and haplotypes and reagents therefor. 

Due in part to a number of new analytical techniques, there has been a significant increase in 
knowledge about genetic information, particularly in humans. Allelic variants of genetic loci have been 
correlated to malignant and non-malignant monogenic and multigenic diseases. For example, monogenic
diseases for which the defective gene has been identified include Duchenne muscular dystrophy, sickle-cell
anemia, Lesch-Nyhan syndrome, hemophilia, beta-thalassemia, cystic fibrosis, polycystic kidney disease,
ADA deficiency, alpha-1-antitrypsin deficiency, Wilms' tumor and retinoblastoma. Other diseases which are
believed to be monogenic for which the gene has not been identified include fragile X mental retardation 
and Huntington's chorea. 

Genes associated with multigenic diseases such as diabetes, colon cancer and premature coronary
atherosclerosis have also been identified.

In addition to identifying individuals at risk for or carriers of genetic diseases, detection of allelic
variants of a genetic locus has been used for organ transplantation, forensics, disputed paternity and a
variety of other purposes in humans. In commercially important plants and animals, genes have not only
been analyzed but genetically engineered and transmitted into other organisms.

A number of techniques have been employed to detect allelic variants of genetic loci including analysis 
of restriction fragment length polymorphic (RFLP) patterns, use of oligonucleotide probes, and DNA 
amplification methods. One of the most complicated groups of allelic variants, the major histocompatibility 
complex (MHC), has been extensively studied. The problems encountered in attempting to determine the
HLA type of an individual are exemplary of problems encountered in characterizing other genetic loci.

The major histocompatibility complex is a cluster of genes that occupy a region on the short arm of 
chromosome 6. This complex, denoted the human leukocyte antigen (HLA) complex, includes at least 50 
loci. For the purposes of HLA tissue typing, two main classes of loci are recognized. The Class I loci 
encode transplantation antigens and are designated A, B and C. The Class II loci (DRA, DRB, DQA1, DQB,
DPA and DPB) encode products that control immune responsiveness. Of the Class II loci, all the loci are
polymorphic with the exception of the DRA locus. That is, the DRα antigen polypeptide sequence is
invariant. 

HLA determinations are used in paternity determinations, transplant compatibility testing, forensics, 
blood component therapy, anthropological studies, and in disease association correlations to diagnose
disease or predict disease susceptibility. Due to the power of HLA to distinguish individuals and the need to
match HLA type for transplantation, analytical methods to unambiguously characterize the alleles of the
genetic loci associated with the complex have been sought. At present, DNA typing using RFLP and
oligonucleotide probes has been used to type Class II locus alleles. Alleles of Class I loci and Class II DR
and DQ loci are typically determined by serological methods. The alleles of the Class II DP locus are
determined by primed lymphocyte typing (PLT).

Each of the HLA analysis methods has drawbacks. Serological methods require standard sera that are 
not widely available and must be continuously replenished. Additionally, serotyping is based on the reaction 
of the HLA gene products in the sample with the antibodies in the reagent sera. The antibodies recognize 
the expression products of the HLA genes on the surface of nucleated cells. The determination of fetal HLA
type by serological methods may be difficult due to lack of maturation of expression of the antigens in fetal
blood cells. 

Oligonucleotide probe typing can be performed in two days and has been further improved by the 
recent use of polymerase chain reaction (PCR) amplification. PCR-based oligoprobe typing has been 
performed on Class II loci. Primed lymphocyte typing requires 5 to 10 days to complete and involves cell 
culture with its difficulties and inherent variability.

RFLP analysis is time consuming, requiring about 5 to 7 days to complete. Analysis of the fragment 
patterns is complex. Additionally, the technique requires the use of labelled probes. The most commonly 
used label, 32P, presents well known drawbacks associated with the use of radionuclides.

A fast, reliable method of genetic locus analysis is highly desirable.

U.S. Patent No. 4,683,195 (to Mullis et al, issued July 28, 1987) describes a process for amplifying,
detecting and/or cloning nucleic acid sequences. The method involves treating separate complementary
strands of DNA with two oligonucleotide primers, extending the primers to form complementary extension
products that act as templates for synthesizing the desired nucleic acid sequence and detecting the
amplified sequence. The method is commonly referred to as the polymerase chain reaction sequence
amplification method or PCR. Variations of the method are described in U.S. Patent No. 4,683,194 (to Saiki
et al, issued July 28, 1987). The polymerase chain reaction sequence amplification method is also
described by Saiki et al, Science, 230:1350-1354 (1985) and Scharf et al, Science, 324:163-166 (1986).

U.S. Patent No. 4,582,788 (to Erlich, issued April 15, 1986) describes an HLA typing method based on
restriction length polymorphism (RFLP) and cDNA probes used therewith. The method is carried out by
digesting an individual's HLA DNA with a restriction endonuclease that produces a polymorphic digestion 
pattern, subjecting the digest to genomic blotting using a labelled cDNA probe that is complementary to an 
HLA DNA sequence involved in the polymorphism, and comparing the resulting genomic blotting pattern 
with a standard. Locus-specific probes for Class II loci (DQ) are also described. 

Kogan et al, New Engl. J. Med, 317:985-990 (1987) describes an improved PCR sequence amplification
method that uses a heat-stable polymerase (Taq polymerase) and high temperature amplification. The
stringent conditions used in the method provide sufficient fidelity of replication to permit analysis of the
amplified DNA by determining DNA sequence lengths by visual inspection of an ethidium bromide-stained
gel. The method was used to analyze DNA associated with hemophilia A in which additional tandem repeats
of a DNA sequence are associated with the disease and the amplified sequences were significantly longer
than sequences that are not associated with the disease. 

Simons and Erlich, pp 952-958 In: Immunology of HLA Vol. 1: Springer-Verlag, New York (1989) 
summarized RFLP-sequence interrelations at the DPA and DPB loci. RFLP fragment patterns analyzed with
probes by Southern blotting provided distinctive patterns for DPw1-5 alleles and the corresponding DPB1
allele sequences, characterized two subtypic patterns for DPw2 and DPw4, and identified new DPw alleles.

Simons et al, pp 959-1023 In: Immunology of HLA Vol. 1: Springer-Verlag, New York (1989) summarized
restriction length polymorphisms of HLA sequences for class II loci as determined by the 10th
International Workshop Southern Blot Analysis. Southern blot analysis was shown to be suitable for typing
of the major classes of HLA loci. 

A series of three articles [Rommens et al, Science 245:1059-1065 (1989), Riordan et al, Science
245:1066-1072 (1989) and Kerem et al, Science 245:1073-1079 (1989)] report a new gene analysis method
called "jumping" used to identify the location of the CF gene, the sequence of the CF gene, and the defect 
in the gene and its percentage in the disease population, respectively. 

DiLella et al, The Lancet i:497-499 (1988) describes a screening method for detecting the two major
alleles responsible for phenylketonuria in Caucasians of Northern European descent. The mutations, located
at about the center of exon 12 and at the exon 12 junction with intervening sequence 12 are detected by 
PCR amplification of a 245 bp region of exon 12 and flanking intervening sequences. The amplified 
sequence encompasses both mutations and is analyzed using probes specific for each of the alleles 
(without prior electrophoretic separation). 

Dicker et al, BioTechniques 7:830-837 (1989) and Mardis et al, BioTechniques 7:840-850 (1989) report
on automated techniques for sequencing of DNA sequences, particularly PCR-generated sequences.

Each of the above-described references is incorporated herein by reference in its entirety.

The present invention provides a method for detection of at least one allele of a genetic locus and can
be used to provide direct determination of the haplotype. The method comprises amplifying genomic DNA
with a primer pair that spans an intron sequence and defines a DNA sequence in genetic linkage with an
allele to be detected. The primer-defined DNA sequence contains a sufficient number of intron sequence 
nucleotides to characterize the allele. Genomic DNA is amplified to produce an amplified DNA sequence 
characteristic of the allele. The amplified DNA sequence is analyzed to detect the presence of a genetic 
variation in the amplified DNA sequence such as a change in the length of the sequence, gain or loss of a
restriction site or substitution of a nucleotide. The variation is characteristic of the allele to be detected.

The present invention is based on the finding that intron sequences contain genetic variations that are 
characteristic of adjacent and remote alleles on the same chromosome. In particular, DNA sequences that 
include a sufficient number of intron sequence nucleotides can be used for direct determination of 
haplotype. 

The method can be used to detect alleles of genetic loci for any eukaryotic organism. Of particular
interest are loci associated with malignant and nonmalignant monogenic and multigenic diseases, and 
identification of individual organisms or species in both plants and animals. In a preferred embodiment, the 
method is used to determine HLA allele type and haplotype. 

Kits comprising one or more of the reagents used in the method are also described. 

The present invention provides a method for detection of alleles and haplotypes through analysis of
intron sequence variation. The present invention is based on the discovery that amplification of intron
sequences that exhibit linkage disequilibrium with adjacent and remote loci can be used to detect alleles of
those loci. The present method reads haplotypes as the direct output of the intron typing analysis when a
single, individual organism is tested. The method is particularly useful in humans but is generally applicable
to all eukaryotes, and is preferably used to analyze plant and animal species. 

The method comprises amplifying genomic DNA with a primer pair that spans an intron sequence and 
defines a DNA sequence in genetic linkage with an allele to be detected. Primer sites are located in
conserved regions in the introns or exons bordering the intron sequence to be amplified. The primer-defined
DNA sequence contains a sufficient number of intron sequence nucleotides to characterize the
allele. The amplified DNA sequence is analyzed to detect the presence of a genetic variation such as a
change in the length of the sequence, gain or loss of a restriction site or substitution of a nucleotide.

The intron sequences provide genetic variations that, in addition to those found in exon sequences,
further distinguish sample DNA, providing additional information about the individual organism. This
information is particularly valuable for identification of individuals such as in paternity determinations and in 
forensic applications. The information is also valuable in any other application where heterozygotes (two 
different alleles) are to be distinguished from homozygotes (two copies of one allele). 

More specifically, the present invention provides information regarding intron variation. Using the
methods and reagents of this invention, two types of intron variation associated with genetic loci have been
found. The first is allele-associated intron variation. That is, the intron variation pattern associates with the
allele type at an adjacent locus. The second type of variation is associated with remote alleles (haplotypes).
That is, the variation is present in individual organisms with the same genotype at the primary locus.
Differences may occur between sequences of the same adjacent and remote locus types. However,
individual-limited variation is uncommon.

Furthermore, an amplified DNA sequence that contains sufficient intron sequences will vary depending
on the allele present in the sample. That is, the introns contain genetic variations (e.g. length polymorphisms
due to insertions and/or deletions and changes in the number or location of restriction sites) which
are associated with the particular allele of the locus and with the alleles at remote loci.

The reagents used in carrying out the methods of this invention are also described. The reagents can
be provided in kit form comprising one or more of the reagents used in the method. 



Definitions 


The term "allele", as used herein, means a genetic variation associated with a coding region; that is, an 
alternative form of the gene. 

The term "linkage", as used herein, refers to the degree to which regions of genomic DNA are inherited 
together. Regions on different chromosomes do not exhibit linkage and are inherited together 50% of the
time. Adjacent genes that are always inherited together exhibit 100% linkage. 

The term "linkage disequilibrium", as used herein, refers to the co-occurrence of two alleles at linked 
loci such that the frequency of the co-occurrence of the alleles is greater than would be expected from the 
separate frequencies of occurrence of each allele. Alleles that co-occur with frequencies expected from their 
separate frequencies are said to be in "linkage equilibrium".
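
The definition of linkage disequilibrium can be illustrated numerically. The following is a minimal sketch and not part of the described method: it assumes that haplotype counts for two loci are already available, and the function name `disequilibrium` and the example counts are hypothetical. It computes the difference between the observed co-occurrence frequency of two alleles and the frequency expected from their separate frequencies; a value of zero corresponds to linkage equilibrium.

```python
def disequilibrium(haplotype_counts, allele_1, allele_2):
    """Observed co-occurrence frequency of two alleles minus the frequency
    expected from their separate frequencies.

    haplotype_counts maps (allele at locus 1, allele at locus 2) to the
    number of chromosomes carrying that combination (hypothetical data).
    """
    total = sum(haplotype_counts.values())
    observed = haplotype_counts.get((allele_1, allele_2), 0) / total
    freq_1 = sum(n for (a, _), n in haplotype_counts.items() if a == allele_1) / total
    freq_2 = sum(n for (_, b), n in haplotype_counts.items() if b == allele_2) / total
    return observed - freq_1 * freq_2


# Hypothetical counts for 100 chromosomes typed at two linked loci.
counts = {("A1", "B1"): 40, ("A1", "B2"): 10, ("A2", "B1"): 10, ("A2", "B2"): 40}
d = disequilibrium(counts, "A1", "B1")
print(round(d, 3))  # positive: A1 and B1 co-occur more often than expected
```

In this invented data set the alleles A1 and B1 each have a frequency of 0.5, so a co-occurrence frequency of 0.25 would be expected at equilibrium, whereas 0.40 is observed.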

As used herein, "haplotype" is a region of genomic DNA on a chromosome which is bounded by 
recombination sites such that genetic loci within a haplotypic region are usually inherited as a unit. 
However, occasionally, genetic rearrangements may occur within a haplotype. Thus, the term haplotype is 
an operational term that refers to the occurrence on a chromosome of linked loci.

As used herein, the term "intron" refers to untranslated DNA sequences between exons, together with
5' and 3' untranslated regions associated with a genetic locus. In addition, the term is used to refer to the
spacing sequences between genetic loci (intergenic spacing sequences) which are not associated with a
coding region and are colloquially referred to as "junk". While the art traditionally uses the term "intron" to
refer only to untranslated sequences between exons, this expanded definition was necessitated by the lack
of any art recognized term which encompasses all non-exon sequences.

As used herein, an "intervening sequence" is an intron which is located between two exons within a 
gene. The term does not encompass upstream and downstream noncoding sequences associated with the 
genetic locus. 

As used herein, the term "amplified DNA sequence" refers to DNA sequences which are copies of a 
portion of a DNA sequence and its complementary sequence, which copies correspond in nucleotide
sequence to the original DNA sequence and its complementary sequence. 

The term "complement", as used herein, refers to a DNA sequence that is complementary to a 
specified DNA sequence.

The term "primer site", as used herein, refers to the area of the target DNA to which a primer
hybridizes.

The term "primer pair", as used herein, means a set of primers including a 5' upstream primer that
hybridizes with the 5' end of the DNA sequence to be amplified and a 3' downstream primer that hybridizes
with the complement of the 3' end of the sequence to be amplified.

The term "exon-limited primers", as used herein, means a primer pair having primers located within or 
just outside of an exon in a conserved portion of the intron, which primers amplify a DNA sequence which
includes an exon or a portion thereof and not more than a small, para-exon region of the adjacent intron(s). 

The term "intron-spanning primers", as used herein, means a primer pair that amplifies at least a
portion of one intron, which amplified intron region includes sequences which are not conserved. The
intron-spanning primers can be located in conserved regions of the introns or in adjacent, upstream and/or
downstream exon sequences. 

The term "genetic locus", as used herein, means the region of the genomic DNA that includes the gene
that encodes a protein including any upstream or downstream transcribed noncoding regions and
associated regulatory regions. Therefore, an HLA locus is the region of the genomic DNA that includes the gene
that encodes an HLA gene product. 

As used herein, the term "adjacent locus" refers to either (1) the locus in which a DNA sequence is 
located or (2) the nearest upstream or downstream genetic locus for intron DNA sequences not associated 
with a genetic locus. 

As used herein, the term "remote locus" refers to either (1) a locus which is upstream or downstream
from the locus in which a DNA sequence is located or (2) for intron sequences not associated with a genetic 
locus, a locus which is upstream or downstream from the nearest upstream or downstream genetic locus to 
the intron sequence. 

The term "locus-specific primer", as used herein, means a primer that specifically hybridizes with a 
portion of the stated gene locus or its complementary strand, at least for one allele of the locus, and does
not hybridize with other DNA sequences under the conditions used in the amplification method. 

As used herein, the terms "endonuclease" and "restriction endonuclease" refer to an enzyme that cuts 
double-stranded DNA having a particular nucleotide sequence. The specificities of numerous endonucleases 
are well known and can be found in a variety of publications, e.g. Molecular Cloning: A Laboratory Manual 
by Maniatis et al, Cold Spring Harbor Laboratory 1982. That manual is incorporated herein by reference in
its entirety. 

The term "restriction fragment length polymorphism" (or RFLP), as used herein, refers to differences in 
DNA nucleotide sequences that produce fragments of different lengths when cleaved by a restriction 
endonuclease. 

The term "primer-defined length polymorphisms" (or PDLP), as used herein, refers to differences in the
lengths of amplified DNA sequences due to insertions or deletions in the intron region of the locus included 
in the amplified DNA sequence. 

The term "HLA DNA", as used herein, means DNA that includes the genes that encode HLA antigens. 
HLA DNA is found in all nucleated human cells. 


Primers 



The method of this invention is based on amplification of selected intron regions of genomic DNA. The
methodology is facilitated by the use of primers that selectively amplify DNA associated with one or more 
alleles of a genetic locus of interest and not with other genetic loci. 

A locus-specific primer pair contains a 5' upstream primer that defines the 5' end of the amplified
sequence by hybridizing with the 5' end of the target sequence to be amplified and a 3' downstream primer
that defines the 3' end of the amplified sequence by hybridizing with the complement of the 3' end of the
DNA sequence to be amplified. The primers in the primer pair do not hybridize with DNA of other genetic 
loci under the conditions used in the present invention. 

For each primer of the locus-specific primer pair, the primer hybridizes to at least one allele of the DNA 
locus to be amplified or to its complement. A primer pair can be prepared for each allele of a selected
locus, which primer pair amplifies only DNA for the selected locus. In this way combinations of primer pairs
can be used to amplify genomic DNA of a particular locus, irrespective of which allele is present in a
sample. Preferably, the primer pair amplifies DNA of at least two, more preferably more than two, alleles of
a locus. In a most preferred embodiment, the primer sites are conserved, and thus amplify all haplotypes.
However, primer pairs or combinations thereof that specifically bind with the most common alleles present
in a particular population group are also contemplated. 

The amplified DNA sequence that is defined by the primers contains a sufficient number of intron
sequence nucleotides to distinguish between at least two alleles of an adjacent locus, and preferably, to
identify the allele of the locus which is present in the sample. For some purposes, the sequence can also
be selected to contain sufficient genetic variations to distinguish between individual organisms with the 
same allele or to distinguish between haplotypes. 



Length of sequence



The length of the amplified sequence which is required to include sufficient genetic variability to enable 
discrimination between all alleles of a locus bears a direct relation to the extent of the polymorphism of the
locus (the number of alleles). That is, as the number of alleles of the tested locus increases, the size of an
amplified sequence which contains sufficient genetic variations to identify each allele increases. For a
particular population group, one or more of the recognized alleles for any given locus may be absent from
that group and need not be considered in determining a sequence which includes sufficient variability for
that group. Conveniently, however, the primer pairs are selected to amplify a DNA sequence which is
sufficient to distinguish between all recognized alleles of the tested locus. The same considerations apply
when a haplotype is determined.

For example, the least polymorphic HLA locus is DPA which currently has four recognized alleles. For
that locus, a primer pair which amplifies only a portion of the variable exon encoding the allelic variation
contains sufficient genetic variability to distinguish between the alleles when the primer sites are located in
an appropriate region of the variable exon. Exon-limited primers can be used to produce an amplified
sequence that includes as few as about 200 nucleotides (nt). However, as the number of alleles of the locus
increases, the number of genetic variations in the sequence must increase to distinguish all alleles. Addition 
of invariant exon sequences provides no additional genetic variation. When about eight or more alleles are 
to be distinguished, as for the DQA1 locus and more variable loci, amplified sequences should extend into 
at least one intron in the locus, preferably an intron adjacent to the variable exon. 
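
The relation between the number of alleles and the number of genetic variations needed to tell them apart can be made concrete with a rough lower bound. The sketch below is illustrative only and rests on a simplifying assumption that is not stated in the text: each variation is treated as a site with exactly two states, so that n sites can distinguish at most 2^n alleles. The function names and the example patterns are hypothetical.

```python
def min_two_state_sites(n_alleles):
    """Smallest n with 2**n >= n_alleles (lower bound under the two-state assumption)."""
    if n_alleles < 1:
        raise ValueError("n_alleles must be at least 1")
    return (n_alleles - 1).bit_length()


def all_alleles_distinct(patterns):
    """True if no two alleles share the same pattern of variations."""
    return len(set(patterns.values())) == len(patterns)


# Allele counts quoted in the text for several HLA loci.
for locus, n_alleles in [("DPA", 4), ("DQA1", 8), ("DQB", 14), ("DPB", 24), ("DRB", 34)]:
    print(locus, n_alleles, min_two_state_sites(n_alleles))

# Hypothetical presence (1) / absence (0) patterns at two sites.
patterns = {"allele_a": (0, 0), "allele_b": (0, 1), "allele_c": (1, 0), "allele_d": (1, 1)}
print(all_alleles_distinct(patterns))
```

Real variations are neither independent nor evenly distributed among alleles, so the number of variations needed in practice will usually exceed this bound, which is consistent with extending the amplified sequence into an adjacent intron for the more polymorphic loci.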

Additionally, where alleles of the locus exist which differ by a single basepair in the variable exon, intron
sequences are included in amplified sequences to provide sufficient variability to distinguish alleles. For
example, for the DQA1 locus (with eight currently recognized alleles) and the DPB locus (with 24
alleles), the DQA1.1/1.2 (now referred to as DQA1 0101/0102) and DPB2.1/4.2 (now referred to as
DPB0201/0402) alleles differ by a single basepair. To distinguish those alleles, amplified sequences which
include an intron sequence region are required. About 300 to 500 nucleotides is sufficient, depending on the
location of the sequence. That is, 300 to 500 nucleotides comprised primarily of intron sequence
nucleotides sufficiently close to the variable exon are sufficient. 

For loci with more extensive polymorphisms (such as DQB with 14 currently recognized alleles, DPB
with 24 currently recognized alleles, DRB with 34 currently recognized alleles and for each of the Class I
loci), the amplified sequences need to be larger to provide sufficient variability to distinguish between all the
alleles. An amplified sequence that includes at least about 0.5 kilobases (Kb), preferably at least about 1.0
Kb, more preferably at least about 1.5 Kb generally provides a sufficient number of restriction sites for loci
with extensive polymorphisms. The amplified sequences used to characterize highly polymorphic loci are
generally between about 800 to about 2,000 nucleotides (nt), preferably between about 1000 to about 1800
nucleotides in length.

When haplotype information regarding remote alleles is desired, the sequences are generally between 
about 1,000 to about 2,000 nt in length. Longer sequences are required when the amplified sequence 
encompasses highly conserved regions such as exons or highly conserved intron regions, e.g., promoters, 
operators and other DNA regulatory regions. Longer amplified sequences (including more intron nucleotide 
sequences) are also required as the distance between the amplified sequences and the allele to be
detected increases. 

Highly conserved regions included in the amplified DNA sequence, such as exon sequences or highly 
conserved intron sequences (e.g. promoters, enhancers, or other regulatory regions) may provide little or no 
genetic variation. Therefore, such regions do not contribute, or contribute only minimally, to the genetic 
variations present in the amplified DNA sequence. When such regions are included in the amplified DNA
sequence, additional nucleotides may be required to encompass sufficient genetic variations to distinguish 
alleles, in comparison to an amplified DNA sequence of the same length including only intron sequences.



Location of the amplified DNA sequence



The amplified DNA sequence is located in a region of genomic DNA that contains genetic variation 
which is in genetic linkage with the allele to be detected. Preferably, the sequence is located in an intron
sequence adjacent to an exon of the genetic locus. More preferably, the amplified sequence includes an
intervening sequence adjacent to an exon that encodes the allelic variability associated with the locus (a
variable exon). The sequence preferably includes at least a portion of one of the introns adjacent to a
variable exon and can include a portion of the variable exon. When additional sequence information is
required, the amplified DNA sequence preferably encompasses a variable exon and all or a portion of both
adjacent intron sequences.

Alternatively, the amplified sequence can be in an intron which does not border an exon of the genetic 
locus. Such introns are located in the downstream or upstream gene flanking regions or even in an
intervening sequence in another genetic locus which is in linkage disequilibrium with the allele to be
detected. 

For some genetic loci, genomic DNA sequences may not be available. When only cDNA sequences are
available and intron locations within the sequence are not identified, primers are selected at intervals of
about 200 nt and used to amplify genomic DNA. If the amplified sequence contains about 200 nt, the
location of the first primer is moved about 200 nt to one side of the second primer location and the
amplification is repeated until either (1) an amplified DNA sequence that is larger than expected is produced
or (2) no amplified DNA sequence is produced. In either case, the location of an intron sequence has been
determined. The same methodology can be used when only the sequence of a marker site that is highly 
linked to the genetic locus is available, as is the case for many genes associated with inherited diseases. 
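
The stepwise search for an intron described above can be sketched as follows. This is an illustrative simulation, not a laboratory protocol: the function `amplify` stands in for a real amplification and size measurement, and `make_simulated_amplify`, its parameters, the 50 nt tolerance and the 5,000 nt amplification limit are all assumptions introduced for the example.

```python
def locate_intron(cdna_length, amplify, step=200, tolerance=50):
    """Walk primer pairs along the cDNA in steps of about `step` nt.

    `amplify(start, end)` returns the length of the product amplified from
    genomic DNA by primers at cDNA positions start and end, or None when
    no product is obtained. Returns the first (start, end) interval whose
    product is larger than expected or absent, else None.
    """
    for start in range(0, cdna_length - step + 1, step):
        end = start + step
        product = amplify(start, end)
        if product is None or product > step + tolerance:
            return (start, end)
    return None


def make_simulated_amplify(intron_position, intron_length, max_product=5000):
    """Hypothetical stand-in for the wet-lab step: one intron of
    `intron_length` nt interrupts the cDNA at `intron_position`."""
    def amplify(start, end):
        length = end - start
        if start < intron_position < end:
            length += intron_length
        return length if length <= max_product else None
    return amplify


# A 600 nt intron gives a larger-than-expected product (case 1).
print(locate_intron(1000, make_simulated_amplify(450, 600)))
# A 10,000 nt intron gives no product at all (case 2).
print(locate_intron(1000, make_simulated_amplify(450, 10000)))
```

In both simulated cases the search stops at the same primer interval, which mirrors the statement that either outcome locates an intron.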

When the amplified DNA sequence does not include all or a portion of an intron adjacent to the variable 
exon(s), the sequence must also satisfy a second requirement. The amplified sequence must be sufficiently
close to the variable exon(s) to exclude recombination and loss of linkage disequilibrium between the
amplified sequence and the variable exon(s). This requirement is satisfied if the regions of the genomic
DNA are within about 5 Kb, preferably within about 4 Kb, most preferably within 2 Kb of the variable
exon(s). The amplified sequence can be outside of the genetic locus but is preferably within the genetic locus.

Preferably, for each primer pair, the amplified DNA sequence defined by the primers includes at least
200 nucleotides, and more preferably at least 400 nucleotides, of an intervening sequence adjacent to the
variable exon(s). Although the variable exon usually provides fewer variations in a given number of
nucleotides than an adjacent intervening sequence, each of those variations provides allele-relevant
information. Therefore, inclusion of the variable exon provides an advantage. 

Since PCR methodology can be used to amplify sequences of several Kb, the primers can be located
so that additional exons or intervening sequences are included in the amplified sequence. Of course, the
increased size of the amplified DNA sequence increases the chance of replication error, so addition of
invariant regions provides some disadvantages. However, those disadvantages are not as likely to affect an
analysis based on the length of the sequence or the RFLP fragment patterns as one based on sequencing
the amplification product. For particular alleles, especially those with highly similar exon sequences,
amplified sequences of greater than about 1 or 1.5 Kb may be necessary to discriminate between all alleles
of a particular locus. 

The ends of the amplified DNA sequence are defined by the primer pair used in the amplification. Each 
primer sequence must correspond to a conserved region of the genomic DNA sequence. Therefore, the 
location of the amplified sequence will, to some extent, be dictated by the need to locate the primers in
conserved regions. When sufficient intron sequence information to determine conserved intron regions is
not available, the primers can be located in conserved portions of the exons and used to amplify intron 
sequences between those exons. 

When appropriately-located, conserved sequences are not unique to the genetic locus, a second primer pair
located within the amplified sequence produced by the first primer pair can be used to provide an amplified
DNA sequence specific for the genetic locus. At least one of the primers of the second primer pair is
located in a conserved region of the amplified DNA sequence defined by the first primer pair. The second 
primer pair is used following amplification with the first primer pair to amplify a portion of the amplified DNA 
sequence produced by the first primer pair. 

There are three major types of genetic variations that can be detected and used to identify an allele. 
Those variations, in order of ease of detection, are (1) a change in the length of the sequence, (2) a change 
in the presence or location of at least one restriction site and (3) the substitution of one or a few nucleotides 
that does not result in a change in a restriction site. Other variations within the amplified DNA sequence are 
also detectable. 

There are three types of techniques which can be used to detect the variations. The first is sequencing 
the amplified DNA sequence. Sequencing is the most time consuming and also the most revealing 
analytical method, since it detects any type of genetic variation in the amplified sequence. The second 
analytical method uses allele-specific oligonucleotide or sequence-specific oligonucleotide probes (ASO or 
SSO probes). Probes can detect single nucleotide changes which result in any of the types of genetic 
variations, so long as the exact sequence of the variable site is known. A third type of analytical method 
detects sequences of different lengths (e.g., due to an insertion or deletion or a change in the location of a 
restriction site) and/or different numbers of sequences (due to either gain or loss of restriction sites). A 
preferred detection method is by gel or capillary electrophoresis. To detect changes in the lengths of 
fragments or the number of fragments due to changes in restriction sites, the amplified sequence must be 
digested with an appropriate restriction endonuclease prior to analysis of fragment length patterns. 

The first genetic variation is a difference in the length of the primer-defined amplified DNA sequence 
referred to herein as a primer-defined length polymorphism (PDLP), which difference in length distinguishes 
between at least two alleles of the genetic locus. The PDLPs result from insertions or deletions of large 
stretches (in comparison to the total length of the amplified DNA sequence) of DNA in the portion of the 
intron sequence defined by the primer pair. To detect PDLPs, the amplified DNA sequence is located in a 
region containing insertions or deletions of a size that is detectable by the chosen method. The amplified 
DNA sequence should have a length which provides optimal resolution of length differences. For 
electrophoresis, DNA sequences of about 300 to 500 bases in length provide optimal resolution of length 
differences. Nucleotide sequences which differ in length by as few as 3 nt, preferably 25 to 50 nt, can be 
distinguished. However, sequences as long as 800 to 2,000 nt which differ by at least about 50 nt are also 
readily distinguishable. Gel electrophoresis and capillary electrophoresis have similar limits of resolution. 
Preferably the length differences between amplified DNA sequences will be at least 10, more preferably 20, 
most preferably 50 or more, nt between the alleles. Preferably, the amplified DNA sequence is between 300 
to 1,000 nt and encompasses length differences of at least 3, preferably 10 or more nt. 

Preferably, the amplified sequence is located in an area which provides PDLP sequences that 
distinguish most or all of the alleles of a locus. An example of PDLP-based identification of five of the eight 
DQA1 alleles is described in detail in the examples. 
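
The length-based screening described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `indistinguishable_pairs`, the allele labels and the lengths are hypothetical and are not taken from the examples of this specification. The default threshold of 10 nt follows the preferred minimum length difference stated above.

```python
from itertools import combinations

def indistinguishable_pairs(amplicon_lengths, min_diff_nt=10):
    """Return allele pairs whose amplified lengths differ by less than min_diff_nt.

    amplicon_lengths maps an allele name to the length (nt) of its
    primer-defined amplified sequence.
    """
    clashes = []
    for (a, len_a), (b, len_b) in combinations(sorted(amplicon_lengths.items()), 2):
        if abs(len_a - len_b) < min_diff_nt:
            clashes.append((a, b))
    return clashes

# Hypothetical lengths (nt), for illustration only.
example = {"allele_1": 420, "allele_2": 445, "allele_3": 448, "allele_4": 500}
print(indistinguishable_pairs(example, min_diff_nt=10))
```

Pairs returned by the function would need a further analysis, such as an RFLP digest, to be told apart.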

When the variation to be detected is a change in a restriction site, the amplified DNA sequence 
necessarily contains at least one restriction site which (1) is present in one allele and not in another, (2) is 
apparently located in a different position in the sequence of at least two alleles, or (3) combinations thereof. 
The amplified sequence will preferably be located such that restriction endonuclease cleavage produces 
fragments of detectably different lengths, rather than two or more fragments of approximately the same 
length. 

For allelic differences detected by ASO or SSO probes, the amplified DNA sequence includes a region 
of from about 200 to about 400 nt which is present in one or more alleles and not present in one or more 
other alleles. In a most preferred embodiment, the sequence contains a region detectable by a probe that is 
present in only one allele of the genetic locus. However, combinations of probes which react with some 
alleles and not others can be used to characterize the alleles. 

For the method described herein, it is contemplated that use of more than one amplified DNA sequence 
and/or use of more than one analytical method per amplified DNA sequence may be required for highly 
polymorphic loci, particularly for loci where alleles differ by single nucleotide substitutions that are not 
unique to the allele or when information regarding remote alleles (haplotypes) is desired. More particularly, 
it may be necessary to combine a PDLP analysis with an RFLP analysis, to use two or more amplified DNA 
sequences located in different positions or to digest a single amplified DNA sequence with a plurality of 
endonucleases to distinguish all the alleles of some loci. These combinations are intended to be included 
within the scope of this invention. 

For example, the analysis of the haplotypes of DQA1 locus described in the examples used PDLPs and 
RFLP analysis using three different enzyme digests to distinguish the eight alleles and 20 of the 32 
haplotypes of the locus. 



Length and sequence homology of primers 



Each locus-specific primer includes a number of nucleotides which, under the conditions used in the 
hybridization, are sufficient to hybridize with an allele of the locus to be amplified and to be free from 
hybridization with alleles of other loci. The specificity of the primer increases with the number of nucleotides 
in its sequence under conditions that provide the same stringency. Therefore, longer primers are desirable. 
Sequences with fewer than 15 nucleotides are less certain to be specific for a particular locus. That is, 
sequences with fewer than 15 nucleotides are more likely to be present in a portion of the DNA associated 
with other genetic loci, particularly loci of other common origin or evolutionarily closely related origin, in 
inverse proportion to the length of the nucleotide sequence. 
Each primer preferably includes at least about 15 nucleotides, more preferably at least about 20 
nucleotides. The primer preferably does not exceed about 30 nucleotides, more preferably about 25 
nucleotides. Most preferably, the primers have between about 20 and about 25 nucleotides. 

A number of preferred primers are described herein. Each of those primers hybridizes with at least 
about 15 consecutive nucleotides of the designated region of the allele sequence. For many of the primers, 
the sequence is not identical for all of the other alleles of the locus. For each of the primers, additional 
preferred primers have sequences which correspond to the sequences of the homologous region of other 
alleles of the locus or to their complements. 

When two sets of primer pairs are used sequentially, with the second primer pair amplifying the product 
of the first primer pair, the primers can be the same size as those used for the first amplification. However, 
smaller primers can be used in the second amplification and provide the requisite specificity. These smaller 
primers can be selected to be allele-specific, if desired. The primers of the second primer pair can have 15 
or fewer, preferably 8 to 12, more preferably 8 to 10 nucleotides. When two sets of primer pairs are used to 
produce two amplified sequences, the second amplified DNA sequence is used in the subsequent analysis 
of genetic variation and must meet the requirements discussed previously for the amplified DNA sequence. 

The primers preferably have a nucleotide sequence that is identical to a portion of the DNA sequence 
to be amplified or its complement. However, a primer having two nucleotides that differ from the target DNA 
sequence or its complement also can be used. Any nucleotides that are not identical to the sequence or its 
complement are preferably not located at the 3' end of the primer. The 3' end of the primer preferably has 
at least two, preferably three or more, nucleotides that are complementary to the sequence to which the 
primer binds. Any nucleotides that are not identical to the sequence to be amplified or its complement will 
preferably not be adjacent in the primer sequence. More preferably, noncomplementary nucleotides in the 
primer sequence will be separated by at least three, more preferably at least five, nucleotides. The primers 
should have a melting temperature (Tm) from about 55° to 75°C. Preferably the Tm is from about 60°C to 
about 65°C to facilitate stringent amplification conditions. 
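
The primer preferences above (length, number and placement of mismatches, and melting temperature) can be checked mechanically. The following is a minimal sketch, not a validated design tool. The function names are hypothetical, and the Tm estimate uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is an assumption introduced here for illustration; this specification does not state how Tm is to be calculated.

```python
def wallace_tm(primer):
    """Rough Tm estimate in deg C by the Wallace rule (an assumed approximation)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

def check_primer(primer, target_site, min_3prime_match=2, min_spacing=3):
    """List the stated preferences that a primer fails to meet.

    primer and target_site are equal-length strings written 5'->3' on the
    same strand, so position len-1 is the 3' end of the primer.
    """
    primer, target_site = primer.upper(), target_site.upper()
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must have the same length")
    mismatches = [i for i, (p, t) in enumerate(zip(primer, target_site)) if p != t]
    problems = []
    if not 15 <= len(primer) <= 30:
        problems.append("length outside about 15 to 30 nt")
    if len(mismatches) > 2:
        problems.append("more than two mismatches")
    if any(i >= len(primer) - min_3prime_match for i in mismatches):
        problems.append("mismatch within the 3' terminal nucleotides")
    # Count the matching nucleotides lying between consecutive mismatches.
    if any(b - a - 1 < min_spacing for a, b in zip(mismatches, mismatches[1:])):
        problems.append("mismatches closer than the preferred spacing")
    if not 55 <= wallace_tm(primer) <= 75:
        problems.append("estimated Tm outside 55 to 75 C")
    return problems

# Hypothetical 20-nt target site, for illustration only.
site = "ATGCATGCATGCATGCATGC"
print(check_primer(site, site))               # perfect match
print(check_primer(site[:-1] + "A", site))    # mismatch at the 3' end
```

An empty list means that none of the listed preferences is violated; it does not establish locus specificity, which depends on the rest of the genome.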

The primers can be prepared using a number of methods, such as, for example, the phosphotriester 
and phosphodiester methods or automated embodiments thereof. The phosphodiester and phosphotriester 
methods are described in Caruthers, Science 230:281-285 (1985); Brown et al, Meth. Enzymol., 68:109 
(1979); and Narang et al, Meth. Enzymol., 68:90 (1979). In one automated method, diethylphosphoramidites 
which can be synthesized as described by Beaucage et al, Tetrahedron Letters, 22:1859-1862 (1981) are 
used as starting materials. A method for synthesizing primer oligonucleotide sequences on a modified solid 
support is described in U.S. Pat. No. 4,458,066. Each of the above references is incorporated herein by 
reference in its entirety. 

Exemplary primer sequences for analysis of Class I and Class II HLA loci, bovine leukocyte antigens, 
and cystic fibrosis are described herein. 


Amplification 

The locus-specific primers are used in an amplification process to produce a sufficient amount of DNA 
for the analysis method. For production of RFLP fragment patterns or PDLP patterns which are analyzed by 
electrophoresis, about 1 to about 500 ng of DNA is required. A preferred amplification method is the 
polymerase chain reaction (PCR). PCR amplification methods are described in U.S. Patent No. 4,683,195 
(to Mullis et al, issued July 28, 1987); U.S. Patent No. 4,683,194 (to Saiki et al, issued July 28, 1987); Saiki 
et al, Science, 230:1350-1354 (1985); Scharf et al, Science, 324:163-166 (1986); Kogan et al, New Engl. J. 
Med, 317:985-990 (1987) and Saiki, Gyllensten and Erlich, The Polymerase Chain Reaction in Genome 
Analysis: A Practical Approach, ed. Davies pp. 141-152, (1988) I.R.L. Press, Oxford. Each of the above 
references is incorporated herein by reference in its entirety. 

Prior to amplification, a sample of the individual organism's DNA is obtained. All nucleated cells contain 
genomic DNA and, therefore, are potential sources of the required DNA. For higher animals, peripheral 
blood cells are typically used rather than tissue samples. As little as 0.01 to 0.05 cc of peripheral blood 
provides sufficient DNA for amplification. Hair, semen and tissue can also be used as samples. In the case 
of fetal analyses, placental cells or fetal cells present in amniotic fluid can be used. The DNA is isolated 
from nucleated cells under conditions that minimize DNA degradation. Typically, the isolation involves 
digesting the cells with a protease that does not attack DNA at a temperature and pH that reduces the 
likelihood of DNase activity. For peripheral blood cells, lysing the cells with a hypotonic solution (water) is 
sufficient to release the DNA. 

DNA isolation from nucleated cells is described by Kan et al, N. Engl. J. Med. 297:1080-1084 (1977); 
Kan et al, Nature 251:392-392 (1974); and Kan et al, PNAS 75:5631-5635 (1978). Each of the above 
references is incorporated herein by reference in its entirety. Extraction procedures for samples such as 
blood, semen, hair follicles, mucous membrane epithelium and other sources of genomic DNA are 
well known. For plant cells, digestion of the cells with cellulase releases DNA. Thereafter DNA is purified as 
described above. 

The extracted DNA can be purified by dialysis, chromatography, or other known methods for purifying 
polynucleotides prior to amplification. Typically, the DNA is not purified prior to amplification. 

The amplified DNA sequence is produced by using the portion of the DNA and its complement 
bounded by the primer pair as a template. As a first step in the method, the DNA strands are separated into 
single-stranded DNA. This strand separation can be accomplished by a number of methods including 
physical or chemical means. A preferred method is the physical method of separating the strands by 
heating the DNA until it is substantially (approximately 93%) denatured. Heat denaturation involves 
temperatures ranging from about 80° to 105°C for times ranging from about 15 to 30 seconds. Typically, 
heating the DNA to a temperature of from 90° to 93°C for about 30 seconds to about 1 minute is sufficient. 

The primer extension product(s) produced are complementary to the primer-defined region of the DNA 
and hybridize therewith to form a duplex of equal length strands. The duplexes of the extension products 
and their templates are then separated into single-stranded DNA. When the complementary strands of the 
duplexes are separated, the strands are ready to be used as a template for the next cycle of synthesis of 
additional DNA strands. 

Each of the synthesis steps can be performed using conditions suitable for DNA amplification. 
Generally, the amplification step is performed in a buffered aqueous solution, preferably at a pH of about 7 
to about 9, more preferably about pH 8. A suitable amplification buffer contains Tris-HCl as a buffering 
agent in the range of about 10 to 100 mM. The buffer also includes a monovalent salt, preferably at a 
concentration of at least about 10 mM and not greater than about 60 mM. Preferred monovalent salts are 
KCl, NaCl and (NH4)2SO4. The buffer also contains MgCl2 at about 5 to 50 mM. Other buffering systems 
such as HEPES or glycine-NaOH and potassium phosphate buffers can be used. Typically, the total volume 
of the amplification reaction mixture is about 50 to 100 µL. 

Preferably, for genomic DNA, a molar excess of about 10^6:1 primer:template of the primer pair is added 
to the buffer containing the separated DNA template strands. A large molar excess of the primers improves 
the efficiency of the amplification process. In general, about 100 to 150 ng of each primer is added. 
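
The relationship between primer mass and molar excess can be made concrete with a short calculation. This is a sketch only. The average masses of about 330 g/mol per nucleotide of single-stranded DNA and about 650 g/mol per base pair of double-stranded DNA are common approximations assumed here, and the template mass and genome size used in the example are illustrative inputs, not values from this specification.

```python
AVG_NT_MASS = 330.0  # g/mol per nucleotide, single-stranded (assumed approximation)
AVG_BP_MASS = 650.0  # g/mol per base pair, double-stranded (assumed approximation)

def primer_moles(mass_ng, length_nt):
    """Moles of a single-stranded primer of the given mass and length."""
    return mass_ng * 1e-9 / (length_nt * AVG_NT_MASS)

def template_moles(mass_ng, genome_bp):
    """Moles of haploid genome copies in the given mass of genomic DNA."""
    return mass_ng * 1e-9 / (genome_bp * AVG_BP_MASS)

def primer_to_template_ratio(primer_ng, primer_len, template_ng, genome_bp):
    """Molar ratio of one primer to single-copy template sequences."""
    return primer_moles(primer_ng, primer_len) / template_moles(template_ng, genome_bp)

# Illustrative inputs: 100 ng of a 20-nt primer, 1000 ng of genomic DNA,
# and an assumed haploid genome size of 3.3e9 bp.
ratio = primer_to_template_ratio(100, 20, 1000, 3.3e9)
print(f"{primer_moles(100, 20) * 1e12:.1f} pmol primer, ratio about {ratio:.1e}:1")
```

Under these assumptions the ratio scales inversely with the amount of template, so smaller DNA samples give a larger excess for the same primer mass.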

The deoxyribonucleotide triphosphates dATP, dCTP, dGTP and dTTP are also added to the amplification 
mixture in amounts sufficient to produce the amplified DNA sequences. Preferably, the dNTPs are 
present at a concentration of about 0.75 to about 4.0 mM, more preferably about 2.0 mM. The resulting 
solution is heated to about 90° to 93°C for from about 30 seconds to about 1 minute to separate the 
strands of the DNA. After this heating period the solution is cooled to the amplification temperature. 

Following separation of the DNA strands, the primers are allowed to anneal to the strands. The 
annealing temperature varies with the length and GC content of the primers. Those variables are reflected in 
the Tm of each primer. Exemplary HLA DQA1 primers of this invention, described below, require 
temperatures of about 55°C. The exemplary HLA Class I primers of this invention require slightly higher 
temperatures of about 62° to about 68°C. The extension reaction step is performed following annealing of 
the primers to the genomic DNA. 

An appropriate agent for inducing or catalyzing the primer extension reaction is added to the 
amplification mixture either before or after the strand separation (denaturation) step, depending on the 
stability of the agent under the denaturation conditions. The DNA synthesis reaction is allowed to occur 
under conditions which are well known in the art. This synthesis reaction (primer extension) can occur at 
from room temperature up to a temperature above which the polymerase no longer functions efficiently. 
Elevating the amplification temperature enhances the stringency of the reaction. As stated previously, 
stringent conditions are necessary to ensure that the amplified sequence and the DNA template sequence 
contain the same nucleotide sequence, since substitution of nucleotides can alter the restriction sites or 
probe binding sites in the amplified sequence. 

The inducing agent may be any compound or system which facilitates synthesis of primer extension 
products, preferably enzymes. Suitable enzymes for this purpose include DNA polymerases (such as, for 
example, E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase I, T4 DNA polymerase), 
reverse transcriptase, and other enzymes (including heat-stable polymerases) which facilitate combination 
of the nucleotides in the proper manner to form the primer extension products. Most preferred is Taq 
polymerase or other heat-stable polymerases which facilitate DNA synthesis at elevated temperatures 
(about 60° to 90°C). Taq polymerase is described, e.g., by Chien et al, J. Bacteriol., 127:1550-1557 
(1976). The article is incorporated herein by reference in its entirety. When the extension step is performed 
at about 72°C, about 1 minute is required for every 1000 bases of target DNA to be amplified. 

The synthesis of the amplified sequence is initiated at the 3' end of each primer and proceeds toward 
the 5' end of the template along the template DNA strand, until synthesis terminates, producing DNA 
sequences of different lengths. The newly synthesized strand and its complementary strand form a 
double-stranded molecule which is used in the succeeding steps of the process. In the next step, the strands of the 
double-stranded molecule are separated (denatured) as described above to provide single-stranded 
molecules. 

New DNA is synthesized on the single-stranded template molecules. Additional polymerase, nucleotides 
and primers can be added if necessary for the reaction to proceed under the conditions described above. 
After this step, half of the extension product consists of the amplified sequence bounded by the two 
primers. The steps of strand separation and extension product synthesis can be repeated as many times as 
needed to produce the desired quantity of the amplified DNA sequence. The amount of the amplified 
sequence produced accumulates exponentially. Typically, about 25 to 30 cycles are sufficient to produce a 
suitable amount of the amplified DNA sequence for analysis. 
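
The two rules of thumb given above, about 1 minute of extension per 1000 bases at about 72°C and exponential accumulation of product over the cycles, can be expressed as simple arithmetic. The sketch below assumes ideal doubling in every cycle unless a lower per-cycle `efficiency` is supplied; that parameter and the function names are illustrative assumptions, and real reactions plateau below the ideal figure.

```python
def extension_seconds(target_length_nt, seconds_per_kb=60.0):
    """Extension time at about 72 C, using about 1 minute per 1000 bases."""
    return seconds_per_kb * target_length_nt / 1000.0

def ideal_copies(start_copies, cycles, efficiency=1.0):
    """Copies after the given number of cycles.

    efficiency=1.0 means perfect doubling per cycle (an idealization).
    """
    return start_copies * (1.0 + efficiency) ** cycles

print(extension_seconds(1500))   # seconds needed for a 1.5 Kb target
print(ideal_copies(1, 25))       # fold amplification after 25 ideal cycles
print(ideal_copies(1, 30))       # fold amplification after 30 ideal cycles
```

Under ideal doubling, 25 cycles give roughly a 3 x 10^7-fold increase and 30 cycles roughly a 10^9-fold increase, which is consistent with 25 to 30 cycles typically being sufficient.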

The amplification method can be performed in a step-wise fashion where after each step new reagents 
are added, or simultaneously, where all reagents are added at the initial step, or partially step-wise and 
partially simultaneously, where fresh reagent is added after a given number of steps. The amplification 
reaction mixture can contain, in addition to the sample genomic DNA, the four nucleotides, the primer pair 
in molar excess, and the inducing agent, e.g., Taq polymerase. 

Each step of the process occurs sequentially notwithstanding the initial presence of all the reagents. 
Additional materials may be added as necessary. Typically, the polymerase is not replenished when using a 
heat-stable polymerase. After the appropriate number of cycles to produce the desired amount of the 
amplified sequence, the reaction may be halted by inactivating the enzymes, separating the components of 
the reaction or stopping the thermal cycling. 

In a preferred embodiment of the method, the amplification includes the use of a second primer pair to 
perform a second amplification following the first amplification. The second primer pair defines a DNA 
sequence which is a portion of the first amplified sequence. That is, at least one of the primers of the 
second primer pair defines one end of the second amplified sequence which is within the ends of the first 
amplified sequence. In this way, the use of the second primer pair helps to ensure that any amplified 
sequence produced in the second amplification reaction is specific for the tested locus. That is, non-target 
sequences which may be copied by a locus-specific pair are unlikely to contain sequences that hybridize 
with a second locus-specific primer pair located within the first amplified sequence. 
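
The positional requirement on the second primer pair can be illustrated by locating both primer pairs on a template sequence. This is a simplified sketch: it uses exact string matching in place of hybridization, and the template and the 10-nt primers are hypothetical sequences chosen only to keep the example short. The function names are likewise assumptions.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon(template, forward, reverse):
    """Return (start, end) of the region bounded by a primer pair, or None.

    forward matches the given strand; reverse is written 5'->3' and matches
    the opposite strand, so its reverse complement is searched for.
    """
    template = template.upper()
    start = template.find(forward.upper())
    site = revcomp(reverse)
    rpos = template.find(site)
    if start == -1 or rpos == -1 or rpos < start:
        return None
    return start, rpos + len(site)

def is_nested(outer, inner):
    """True if inner lies within outer and at least one of its ends is internal."""
    if outer is None or inner is None:
        return False
    within = outer[0] <= inner[0] and inner[1] <= outer[1]
    return within and (inner[0] > outer[0] or inner[1] < outer[1])

# Hypothetical 50-nt template built from five 10-nt blocks.
template = "ACGTACGGTT" + "CATCAGGATC" + "TTTTTTTTTT" + "GATTACAGGC" + "CCTGAGTCAA"
first = amplicon(template, template[0:10], revcomp(template[40:50]))
second = amplicon(template, template[10:20], revcomp(template[30:40]))
print(first, second, is_nested(first, second))
```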

In another embodiment, the second primer pair is specific for one allele of the locus. In this way, 
detection of the presence of a second amplified sequence indicates that the allele is present in the sample. 
The presence of a second amplified sequence can be determined by quantitating the amount of DNA at the 
start and the end of the second amplification reaction. Methods for quantitating DNA are well known and 
include determining the optical density at 260 nm (OD260), and preferably additionally determining the ratio of 
the optical density at 260 nm to the optical density at 280 nm (OD260/OD280) to determine the amount of DNA in 
comparison to protein in the sample. 
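
The optical density comparison can be sketched as follows. The conversion of 1 OD260 unit to about 50 µg/mL is a widely used approximation for double-stranded DNA and is assumed here rather than taken from this specification. The `min_fold` threshold used to decide whether a second amplified sequence was produced is a hypothetical parameter for illustration.

```python
DSDNA_UG_PER_ML_PER_OD260 = 50.0  # assumed approximation for double-stranded DNA

def dna_ug_per_ml(od260, dilution=1.0):
    """Estimate the DNA concentration of the undiluted sample from OD260."""
    return od260 * DSDNA_UG_PER_ML_PER_OD260 * dilution

def purity_ratio(od260, od280):
    """OD260/OD280 ratio, indicating DNA relative to protein."""
    if od280 <= 0:
        raise ValueError("OD280 must be positive")
    return od260 / od280

def second_amplification_occurred(od260_start, od260_end, min_fold=2.0):
    """Compare DNA at the start and end of the second amplification.

    min_fold is a hypothetical decision threshold, not a value from the text.
    """
    if od260_start <= 0:
        raise ValueError("starting OD260 must be positive")
    return od260_end / od260_start >= min_fold

print(dna_ug_per_ml(0.2, dilution=10))
print(second_amplification_occurred(0.05, 0.40))
```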

Preferably, the first amplification will contain sufficient primer for only a limited number of primer 
extension cycles, e.g. less than 15, preferably about 10 to 12 cycles, so that the amount of amplified 
sequence produced by the process is sufficient for the second amplification but does not interfere with a 
determination of whether amplification occurred with the second primer pair. Alternatively, the amplification 
reaction can be continued for additional cycles and aliquoted to provide appropriate amounts of DNA for 
one or more second amplification reactions. Approximately 100 to 150 ng of each primer of the second 
primer pair is added to the amplification reaction mixture. The second set of primers is preferably added 
following the initial cycles with the first primer pair. The amount of the first primer pair can be limited in 
comparison to the second primer pair so that, following addition of the second pair, substantially all of the 
amplified sequences will be produced by the second pair. 

As stated previously, the DNA can be quantitated to determine whether an amplified sequence was 
produced in the second amplification. If protein in the reaction mixture interferes with the quantitation 
(usually due to the presence of the polymerase), the reaction mixture can be purified, as by using a 
100,000 MW cutoff filter. Such filters are commercially available from Millipore and from Centricon. 



EP 0 414 469 A2 



Analysis of the Amplified DNA Sequence 



As discussed previously, the method used to analyze the amplified DNA sequence to characterize the 
allele(s) present in the sample DNA depends on the genetic variation in the sequence. When distinctions 
between alleles include primer-defined length polymorphisms, the amplified sequences are separated 
based on length, preferably using gel or capillary electrophoresis. When using probe hybridization for 
analysis, the amplified sequences are reacted with labeled probes. When the analysis is based on RFLP 
fragment patterns, the amplified sequences are digested with one or more restriction endonucleases to 
produce a digest and the resultant fragments are separated based on length, preferably using gel or 
capillary electrophoresis. When the only variation encompassed by the amplified sequence is a sequence 
variation that does not result in a change in length or a change in a restriction site and is unsuitable for 
detection by a probe, the amplified DNA sequences are sequenced. 
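
The choice of analytical method described above amounts to a small decision rule, which can be written out as follows. The function name and the wording of the returned labels are illustrative assumptions; the logic simply restates the paragraph, with sequencing as the fallback when no other method applies.

```python
def choose_analysis(length_polymorphism, restriction_site_change, probe_detectable):
    """Return the analysis methods suited to the variations that are present."""
    methods = []
    if length_polymorphism:
        methods.append("separate amplified sequences by length (PDLP)")
    if restriction_site_change:
        methods.append("digest with endonuclease, then separate fragments (RFLP)")
    if probe_detectable:
        methods.append("hybridize with labeled ASO or SSO probes")
    if not methods:
        # No length, restriction site or probe-detectable variation remains.
        methods.append("sequence the amplified DNA")
    return methods

print(choose_analysis(True, True, False))
print(choose_analysis(False, False, False))
```

Returning a list rather than a single method reflects the point made earlier that more than one analytical method per amplified sequence may be needed for highly polymorphic loci.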

Procedures for each step of the various analytical methods are well known and are described below. 


Production of RFLP Fragment Patterns 




Restriction endonucleases 

A restriction endonuclease is an enzyme that cleaves or cuts DNA hydrolytically at a specific nucleotide 
sequence called a restriction site. Endonucleases that produce blunt end DNA fragments (hydrolysis of the 
phosphodiester bonds on both DNA strands occurs at the same site) as well as endonucleases that produce 
sticky ended fragments (the hydrolysis sites on the strands are separated by a few nucleotides from each 
other) can be used. 

Restriction enzymes are available commercially from a number of sources including Sigma 
Pharmaceuticals, Bethesda Research Labs, Boehringer-Mannheim and Pharmacia. As stated previously, a 
restriction endonuclease used in the present invention cleaves an amplified DNA sequence of this invention 
to produce a digest comprising a set of fragments having distinctive fragment lengths. In particular, the 
fragments for one allele of a locus differ in size from the fragments for other alleles of the locus. The 
patterns produced by separation and visualization of the fragments of a plurality of digests are sufficient to 
distinguish each allele of the locus. More particularly, the endonucleases are chosen so that by using a 
plurality of digests of the amplified sequence, preferably fewer than five, more preferably two or three 
digests, the alleles of a locus can be distinguished. 

In selecting an endonuclease, the important consideration is the number of fragments produced for 
amplified sequences of the various alleles of a locus. More particularly, a sufficient number of fragments 
must be produced to distinguish between the alleles and, if required, to provide for individuality 
determinations. However, the number of fragments must not be so large or so similar in size that a pattern that is not 
distinguishable from those of other haplotypes by the particular detection method is produced. Preferably, 
the fragments are of distinctive sizes for each allele. That is, for each endonuclease digest of a particular 
amplified sequence, the fragments for an allele preferably differ from the fragments for every other allele of 
the locus by at least 10, preferably 20, more preferably 30, most preferably 50 or more nucleotides. 

One of ordinary skill can readily determine whether an endonuclease produces RFLP fragments having 
distinctive fragment lengths. The determination can be made experimentally by cleaving an amplified 
sequence for each allele with the designated endonuclease in the invention method. The fragment patterns 
can then be analyzed. Distinguishable patterns will be readily recognized by determining whether 
comparison of two or more digest patterns is sufficient to demonstrate characteristic differences between the 
patterns of the alleles. 
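
Where allele sequences are already known, the same determination can be approximated computationally before any enzyme is used. The sketch below cuts a linear sequence at every occurrence of a recognition site and compares the resulting fragment lengths. It is a simplified model: it considers one strand only, and the comparison rule in `patterns_distinguishable` is one possible reading of the size criterion stated above, not a definition from this specification. The two allele sequences are hypothetical; EcoRI (recognition site GAATTC, cutting after the first G) is used as a familiar example enzyme.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths from cutting a linear sequence at every site.

    cut_offset is the number of nucleotides from the start of the site to
    the cut on the given strand (1 for EcoRI, G^AATTC).
    """
    sequence, site = sequence.upper(), site.upper()
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

def patterns_distinguishable(frags_a, frags_b, min_diff_nt=10):
    """True if either pattern has a fragment with no counterpart within
    min_diff_nt of its length in the other pattern."""
    def has_unmatched(xs, ys):
        return any(all(abs(x - y) >= min_diff_nt for y in ys) for x in xs)
    return has_unmatched(frags_a, frags_b) or has_unmatched(frags_b, frags_a)

# Hypothetical alleles: allele_b has lost the site through one substitution.
allele_a = "A" * 100 + "GAATTC" + "A" * 200
allele_b = "A" * 100 + "GAGTTC" + "A" * 200
pattern_a = digest(allele_a, "GAATTC", 1)
pattern_b = digest(allele_b, "GAATTC", 1)
print(pattern_a, pattern_b, patterns_distinguishable(pattern_a, pattern_b))
```

A computed pattern is only a prediction; experimental digestion remains the determination described in the text.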

The number of digests that need to be prepared for any particular analysis will depend on the desired 
information and the particular sample to be analyzed. Since HLA analyses are used for a variety of 
purposes ranging from individuality determinations for forensics and paternity to tissue typing for 
transplantation, the HLA complex will be used as exemplary. 

A single digest may be sufficient to determine that an individual cannot be the person whose blood was 
found at a crime scene. In general, however, where the DNA samples do not differ, the use of two to three 
digests for each of two to three HLA loci will be sufficient for matching applications (forensics, paternity). 
For complete HLA typing, each locus needs to be determined. 

In a preferred embodiment, sample HLA DNA sequences are divided into aliquots containing similar 
amounts of DNA per aliquot and are amplified with primer pairs (or combinations of primer pairs) to produce 
amplified DNA sequences for a number of HLA loci. Each amplification mixture contains only primer pairs 
for one HLA locus. The amplified sequences are preferably processed concurrently, so that a number of 
digest RFLP fragment patterns can be produced from one sample. In this way, the HLA type for a number 
of alleles can be determined simultaneously. 

Alternatively, preparation of a number of RFLP fragment patterns provides additional comparisons of 
patterns to distinguish samples for forensic and paternity analyses where analysis of one locus frequently 
fails to provide sufficient information for the determination when the sample DNA has the same allele as the 
DNA to which it is compared. 



Production of RFLP fragments 



Following amplification, the amplified DNA sequence is combined with an endonuclease that cleaves or 
cuts the amplified DNA sequence hydrolytically at a specific restriction site. The combination of the 
endonuclease with the amplified DNA sequence produces a digest containing a set of fragments having 
distinctive fragment lengths. U.S. Patent No. 4,582,788 (to Erlich, issued April 15, 1986) describes an HLA 
typing method based on restriction fragment length polymorphism (RFLP). That patent is incorporated herein by 
reference in its entirety. 

In a preferred embodiment, two or more aliquots of the amplification reaction mixture having 
approximately equal amounts of DNA per aliquot are prepared. Conveniently about 5 to about 10 µl of a 100 µl 
reaction mixture is used for each aliquot. Each aliquot is combined with a different endonuclease to produce 
a plurality of digests. In this way, by using a number of endonucleases for a particular amplified DNA 
sequence, locus-specific combinations of endonucleases that distinguish a plurality of alleles of a particular 
locus can be readily determined. Following preparation of the digests, each of the digests can be used to 
form RFLP patterns. Preferably, two or more digests can be pooled prior to pattern formation. 

Alternatively, two or more restriction endonucleases can be used to produce a single digest. The digest 
differs from one where each enzyme is used separately and the resultant fragments are pooled since 
fragments produced by one enzyme may include one or more restriction sites recognized by another 
enzyme in the digest. Patterns produced by simultaneous digestion by two or more enzymes will include 
more fragments than pooled products of separate digestions using those enzymes and will be more 
complex to analyze. 

Furthermore, one or more restriction endonucleases can be used to digest two or more amplified DNA 
sequences. That is, for more complete resolution of all the alleles of a locus, it may be desirable to produce 
amplified DNA sequences encompassing two different regions. The amplified DNA sequences can be 
combined and digested with at least one restriction endonuclease to produce RFLP patterns. 

The digestion of the amplified DNA sequence with the endonuclease can be carried out in an aqueous 
solution under conditions favoring endonuclease activity. Typically the solution is buffered to a pH of about 
6.5 to 8.0. Mild temperatures, preferably about 20°C to about 45°C, more preferably physiological 
temperatures (25° to 40°C), are employed. Restriction endonucleases normally require magnesium ions 
and, in some instances, cofactors (ATP and S-adenosyl methionine) or other agents for their activity. 
Therefore, a source of such ions, for instance inorganic magnesium salts, and other agents, when required, 
are present in the digestion mixture. Suitable conditions are described by the manufacturer of the 
endonuclease and generally vary as to whether the endonuclease requires high, medium or low salt 
conditions for optimal activity. 

The amount of DNA in the digestion mixture is typically in the range of 1% to 20% by weight. In most
instances 5 to 20 µg of total DNA digested to completion provides an adequate sample for production of
RFLP fragments. Excess endonuclease, preferably one to five units/µg DNA, is used.
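The enzyme excess stated above is simple arithmetic, sketched below for convenience. The function name and its default argument are hypothetical; only the one to five units per microgram range and the 5 to 20 µg sample sizes come from the text.

```python
def enzyme_units_range(dna_ug, units_per_ug=(1.0, 5.0)):
    """Return (min_units, max_units) of endonuclease for a DNA mass in micrograms."""
    if dna_ug <= 0:
        raise ValueError("DNA mass must be positive")
    low, high = units_per_ug
    return dna_ug * low, dna_ug * high


# The two ends of the 5 to 20 microgram range mentioned in the text.
for ug in (5, 20):
    lo, hi = enzyme_units_range(ug)
    print(f"{ug} ug DNA: {lo:g} to {hi:g} units")
```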

The set of fragments in the digest is preferably further processed to produce RFLP patterns which are
analyzed. If desired, the digest can be purified by precipitation and resuspension as described by Kan et al,
PNAS 75:5631-5635 (1978), prior to additional processing. That article is incorporated herein by reference
in its entirety.

Once produced, the fragments are analyzed by well known methods. Preferably, the fragments are
analyzed using electrophoresis. Gel electrophoresis methods are described in detail hereinafter. Capillary
electrophoresis methods can be automated (as by using the Model 207A analytical capillary electrophoresis
system from Applied Biosystems of Foster City, CA) and are described in Chin et al, American Biotechnology
Laboratory News Edition, December 1989.
Electrophoretic Separation of DNA Fragments



Electrophoresis is the separation of DNA sequence fragments contained in a supporting medium by
size and charge under the influence of an applied electric field. Gel sheets or slabs, e.g. agarose, agarose-
acrylamide or polyacrylamide, are typically used for nucleotide sizing gels. The electrophoresis conditions
affect the desired degree of resolution of the fragments. A degree of resolution that separates fragments
that differ in size from one another by as little as 10 nucleotides is usually sufficient. Preferably, the gels will
be capable of resolving fragments which differ by 3 to 5 nucleotides. However, for some purposes (where
the differences in sequence length are large), discrimination of sequence differences of at least 100 nt may
be sufficiently sensitive for the analysis.
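Whether a given set of fragments can be told apart at a stated gel resolution can be checked with a short sketch. The function name, the example fragment sizes and the convention that a difference equal to the resolution counts as resolved are assumptions for illustration.

```python
def unresolved_pairs(fragment_sizes, resolution_nt=10):
    """Return adjacent size pairs that differ by less than the gel's resolution."""
    sizes = sorted(fragment_sizes)
    return [(a, b) for a, b in zip(sizes, sizes[1:]) if b - a < resolution_nt]


# Hypothetical fragment sizes in nucleotides.
fragments = [120, 126, 240, 245, 400]

print(unresolved_pairs(fragments, resolution_nt=10))  # 10-nt resolution
print(unresolved_pairs(fragments, resolution_nt=5))   # finer, 5-nt resolution
```

With the hypothetical sizes above, two pairs co-migrate at 10-nt resolution, while a gel resolving 5-nt differences separates all of them.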

Preparation and staining of analytical gels is well known. For example, a 3% NuSieve 1% agarose gel
which is stained using ethidium bromide is described in Boerwinkle et al, PNAS, 86:212-216 (1989).
Detection of DNA in polyacrylamide gels using silver stain is described in Goldman et al, Electrophoresis,
3:24-26 (1982); Marshall, Electrophoresis, 4:269-272 (1983); Tegelstrom, Electrophoresis, 7:226-229 (1987);
and Allen et al, BioTechniques 7:736-744 (1989). The method described by Allen et al, using large-pore
size ultrathin-layer, rehydratable polyacrylamide gels stained with silver is preferred. Each of those articles
is incorporated herein by reference in its entirety.

Size markers can be run on the same gel to permit estimation of the size of the restriction fragments.
Comparison to one or more control sample(s) can be made in addition to or in place of the use of size
markers. The size markers or control samples are usually run in one or both of the lanes at the edge of the
gel, and preferably, also in at least one central lane. In carrying out the electrophoresis, the DNA fragments
are loaded onto one end of the gel slab (commonly called the "origin") and the fragments separate by
electrically facilitated transport through the gel, with the shortest fragment electrophoresing from the origin
towards the other (anode) end of the slab at the fastest rate. An agarose slab gel is typically
electrophoresed using about 100 volts for 30 to 45 minutes. A polyacrylamide slab gel is typically
electrophoresed using about 200 to 1,200 volts for 45 to 60 minutes.
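One common way to estimate a fragment size from co-run size markers is to interpolate the logarithm of size against migration distance. The sketch below assumes that approach; the specification does not prescribe a particular estimation procedure, and the marker values, distances and function name are hypothetical.

```python
import math


def estimate_size(distance, markers):
    """Estimate fragment size (nt) from migration distance.

    markers is a list of (size_nt, distance) pairs with distinct distances.
    log10(size) is interpolated linearly between the two flanking markers.
    """
    pts = sorted(markers, key=lambda m: m[1])  # order markers by distance
    for (s1, d1), (s2, d2) in zip(pts, pts[1:]):
        if d1 <= distance <= d2:
            frac = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance outside the range spanned by the markers")


# Hypothetical markers: shorter fragments travel farther from the origin.
markers = [(1000, 10.0), (100, 30.0)]
print(round(estimate_size(20.0, markers), 1))
```

A band outside the range of the markers raises an error in this sketch and is not extrapolated.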

After electrophoresis, the gel is readied for visualization. The DNA fragments can be visualized by
staining the gel with a nucleic acid-specific stain such as ethidium bromide or, preferably, with silver stain,
which is not specific for DNA. Ethidium bromide staining is described in Boerwinkle et al, supra. Silver
staining is described in Goldman et al, supra, Marshall, supra, Tegelstrom, supra, and Allen et al, supra.

Probes 


Allele-specific oligonucleotides or probes are used to identify DNA sequences which have regions that
hybridize with the probe sequence. The amplified DNA sequences defined by a locus-specific primer pair
can be used as probes in RFLP analyses using genomic DNA. U.S. Patent No. 4,582,788 (to Erlich, issued
April 15, 1986) describes an exemplary HLA typing method based on analysis of RFLP patterns produced
by genomic DNA. The analysis uses cDNA probes to analyze separated DNA fragments in a Southern blot
type of analysis. As stated in the patent, "[c]omplementary DNA probes that are specific to one (locus-
specific) or more (multilocus) particular HLA DNA sequences involved in the polymorphism are essential
components of the hybridization step of the typing method" (col. 6, l. 3-7).

The amplified DNA sequences of the present method can be used as probes in the method described
in that patent or in the present method to detect the presence of an amplified DNA sequence of a particular
allele. More specifically, an amplified DNA sequence having a known allele can be produced and used as a
probe to detect the presence of the allele in sample DNA which is amplified by the present method.

Preferably, however, when a probe is used to distinguish alleles in the amplified DNA sequences of the
present invention, the probe has a relatively short sequence (in comparison to the length of the amplified
DNA sequence) which minimizes the sequence homology of other alleles of the locus with the probe
sequence. That is, the probes will correspond to a region of the amplified DNA sequence which has the
largest number of nucleotide differences from the amplified DNA sequences of other alleles produced using
that primer pair.
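The choice of a probe region can be sketched as a sliding-window search over aligned allele sequences. The scoring rule used here (maximize the smallest mismatch count against any other allele) is one possible reading of "largest number of nucleotide differences" and is an assumption, as are the function name and the toy sequences.

```python
def best_probe_window(target, others, length):
    """Return (start, min_mismatches) for the best window of the target allele.

    The best window maximizes the smallest mismatch count against every other
    allele. All sequences must already be aligned to the same length.
    """
    if any(len(o) != len(target) for o in others):
        raise ValueError("sequences must be aligned to the same length")
    best = (0, -1)
    for start in range(len(target) - length + 1):
        window = target[start:start + length]
        worst_case = min(
            sum(a != b for a, b in zip(window, o[start:start + length]))
            for o in others
        )
        if worst_case > best[1]:  # keep the first window reaching a new maximum
            best = (start, worst_case)
    return best


# Toy aligned alleles (hypothetical, not taken from the tables).
target = "ACGTACGTAAGGCCTTACGT"
others = ["ACGTACGTATGCCCTTACGT", "ACGTACGTAAGGGGTTACGT"]

start, score = best_probe_window(target, others, length=6)
print(start, score, target[start:start + 6])
```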

The probes can be labelled with a detectable atom, radical or ligand using known labeling techniques.
Radiolabels, using 32P, are typically used. The probes can be labeled with 32P by nick translation with an
α-32P-dNTP (Rigby et al, J. Mol. Biol., 113:237 (1977)) or other available procedures to make the locus-
specific probes for use in the methods described in the patent. The probes are preferably labeled with an
enzyme, such as hydrogen peroxidase. Methods for coupling enzyme labels to nucleotide sequences are well known.
Each of the above references is incorporated herein by reference in its entirety.

The analysis method known as "Southern blotting" that is described by Southern, J. Mol. Biol., 98:503-
517 (1975) is an analysis method that relies on the use of probes. In Southern blotting the DNA fragments
are electrophoresed, transferred and affixed to a support that binds nucleic acid, and hybridized with an
appropriately labeled cDNA probe. Labeled hybrids are detected by autoradiography, or preferably, use of
enzyme labels.

Reagents and conditions for blotting are described by Southern, supra; Wahl et al, PNAS 76:3683-3687
(1979); Kan et al, PNAS, supra; U.S. Pat. No. 4,302,204; and Molecular Cloning: A Laboratory Manual by
Maniatis et al, Cold Spring Harbor Laboratory 1982. After the transfer is complete the paper is separated
from the gel and is dried. Hybridization (annealing) of the resolved single stranded DNA on the paper to a
probe is effected by incubating the paper with the probe under hybridizing conditions. See Southern, supra;
Kan et al, PNAS, supra and U.S. Pat. No. 4,302,204, col 5, line 8 et seq. Complementary DNA probes
specific for one allele, one locus (locus-specific) or more are essential components of the hybridization step
of the typing method. Locus-specific probes can be made by the amplification method for locus-specific
amplified sequences, described above. The probes are made detectable by labeling as described above.

The final step in the Southern blotting method is identifying labeled hybrids on the paper (or gel in the
solution hybridization embodiment). Autoradiography can be used to detect radiolabel-containing hybrids.
Enzyme labels are detected by use of a color development system specific for the enzyme. In general, the
enzyme cleaves a substrate, which cleavage causes the substrate either to develop or to change color. The
color can be visually perceptible in natural light or can be that of a fluorochrome which is excited by a
known wavelength of light.



Sequencing



Genetic variations in amplified DNA sequences which reflect allelic difference in the sample DNA can
also be detected by sequencing the amplified DNA sequences. Methods for sequencing oligonucleotide
sequences are well known and are described in, for example, Molecular Cloning: A Laboratory Manual by
Maniatis et al, Cold Spring Harbor Laboratory 1982. Currently, sequencing can be automated using a
number of commercially available instruments.

Due to the amount of time currently required to obtain sequencing information, other analysis methods,
such as gel electrophoresis of the amplified DNA sequences or a restriction endonuclease digest thereof,
are preferred for clinical analyses.



Kits 


As stated previously, the kits of this invention comprise one or more of the reagents used in the above
described methods. In one embodiment, a kit comprises at least one genetic locus-specific primer pair in a
suitable container. Preferably the kit contains two or more locus-specific primer pairs. In one embodiment,
the primer pairs are for different loci and are in separate containers. In another embodiment, the primer
pairs are specific for the same locus. In that embodiment, the primer pairs will preferably be in the same
container when specific for different alleles of the same genetic locus and in different containers when
specific for different portions of the same allele sequence. Sets of primer pairs which are used sequentially
can be provided in separate containers in one kit. The primers of each pair can be in separate containers,
particularly when one primer is used in each set of primer pairs. However, each pair is preferably provided
at a concentration which facilitates use of the primers at the concentrations required for all amplifications in
which it will be used.

The primers can be provided in a small volume (e.g. 100 µl) of a suitable solution such as sterile water
or Tris buffer and can be frozen. Alternatively, the primers can be air dried.

In another embodiment, a kit comprises, in separate containers, two or more endonucleases useful in
the methods of this invention. The kit will preferably contain a locus-specific combination of endonucleases.
The endonucleases can be provided in a suitable solution such as normal saline or physiologic buffer with
50% glycerol (at about -20°C) to maintain enzymatic activity.

The kit can contain one or more locus-specific primer pairs together with locus-specific combinations of
endonucleases and may additionally include a control. The control can be an amplified DNA sequence
defined by a locus-specific primer pair or DNA having a known HLA type for a locus of interest.

Additional reagents such as amplification buffer, digestion buffer, a DNA polymerase and nucleotide
triphosphates can be provided separately or in the kit. The kit may additionally contain gel preparation and
staining reagents or preformed gels.

Analyses of exemplary genetic loci are described below. 

Analysis of the HLA Type 


The present method of analysis of genetic variation in an amplified DNA sequence to determine allelic
difference in sample DNA can be used to determine HLA type. Primer pairs that specifically amplify
genomic DNA associated with one HLA locus are described in detail hereinafter. In a preferred embodiment,
the primers define a DNA sequence that contains all exons that encode allelic variability associated with the
HLA locus together with at least a portion of one of the adjacent intron sequences. For Class I loci, the
variable exons are the second and third exons. For Class II loci, the variable exon is the second exon. The
primers are preferably located so that a substantial portion of the amplified sequence corresponds to intron
sequences.

The intron sequences provide restriction sites that, in comparison to cDNA sequences, provide
additional information about the individual; e.g., the haplotype. Inclusion of exons within the amplified DNA
sequences does not provide as many genetic variations that enable distinction between alleles as an intron
sequence of the same length, particularly for constant exons. This additional intron sequence information is
particularly valuable in paternity determinations and in forensic applications. It is also valuable in typing for
transplant matching in that the variable lengths of intron sequences included in the amplified sequence
produced by the primers enable a distinction to be made between certain heterozygotes (two different
alleles) and homozygotes (two copies of one allele).

Allelic differences in the DNA sequences of HLA loci are illustrated below. The tables illustrate the
sequence homology of various alleles and indicate exemplary primer binding sites. Table 1 is an illustration
of the alignment of the nucleotides of the Class I A2, A3, Ax, A24 (formerly referred to as A9), B27, B58
(formerly referred to as B17), C1, C2 and C3 allele sequences in intervening sequences (IVS) I and III. (The
gene sequences and their numbering that are used in the tables and throughout the specification can be
found in the Genbank and/or European Molecular Biology Laboratories (EMBL) sequence databanks. Those
sequences are incorporated herein by reference in their entirety.) Underlined nucleotides represent the
regions of the sequence to which exemplary locus-specific or Class I-specific primers bind.

Table 2 illustrates the alignment of the nucleotides in IVS I and II of the DQA3 (now DQA1 0301),
DQA1.2 (now DQA1 0102) and DQA4.1 (now DQA1 0501) alleles of the DQA1 locus (formerly referred to as
the DR4, DR6 and DR3 alleles of the DQA1 locus, respectively). Underlined nucleotides represent the
regions of the sequence to which exemplary DQA1 locus-specific primers bind.

Table 3 illustrates the alignment of the nucleotides in IVS I, exon 2 and IVS II of two individuals having
the DQw1v allele (designated hereinafter as DQw1va and DQw1vb for the upper and lower sequences in the
table, respectively), the DQw2 and DQw8 alleles of the DQB1 locus. Nucleotides indicated in the DQw1vb,
DQw2 and DQw8 allele sequences are those which differ from the DQw1va sequence. Exon 2 begins and
ends at nt 599 and nt 870 of the DQw1va allele sequence, respectively. Underlined nucleotides represent
the regions of the sequence to which exemplary DQB1 locus-specific primers bind.

Table 4 illustrates the alignment of the nucleotides in IVS I, exon 2 and IVS II of the DPB4.1, DPB9,
New and DPw3 alleles of the DPB1 locus. Nucleotides indicated in the DPB9, New and DPw3 allele
sequences are those which differ from the DPB4.1 sequence. Exon 2 begins and ends at nt 7644 and nt
7907 of the DPB4.1 allele sequence, respectively. Underlined nucleotides represent the regions of the
sequence to which exemplary DPB1 locus-specific primers bind.
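The difference-row convention used in Tables 3 and 4 (only nucleotides that differ from the reference allele are printed) can be expanded back into a full sequence with a short sketch. It assumes that the rows are column-aligned, that a blank means "same as the reference", and that any printed character, including a "-" gap mark, replaces the reference character; the function name and toy sequences are hypothetical.

```python
def expand_difference_row(reference, diff_row):
    """Rebuild a full allele row from a row listing only the differing nucleotides.

    A blank in diff_row means the allele matches the reference at that column.
    Any other character (including '-') replaces the reference character.
    """
    if len(diff_row) > len(reference):
        raise ValueError("difference row is longer than the reference row")
    padded = diff_row.ljust(len(reference))  # missing trailing columns match
    return "".join(r if d == " " else d for r, d in zip(reference, padded))


# Toy rows (hypothetical, not taken from the tables).
reference = "ACGTACGTAC"
diff_row = "  T    G"

print(expand_difference_row(reference, diff_row))
```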



TABLE 1

Class I Seq


CI 


1 


GATTAGCAATATTGTGCGAOCTACIGTATCAATAAAC 


C2 


1 


T 


CI 




AAAj*j\GGA.^ACTGGTCKT ATGAGAATCTCTACC' l(JJ IGCTTTCAGACAA 


ro 


JO 


G G 


CI 


88 


C^CTTCACCAG3TTTAMGAGAAMCT0CTGACTCTAC^CC 




PA 


* 




i 
i 


GAGCTCACrCTCTGGCATCAAGTTC TCCGTG 


CI 


138 


AGGGCGAGGTCACrGTCTGGCAGCMGTT^ 


C2 


138 


T 




1 


MGCTTACTCTCTGGC^CCAAAC TCCATGGGATGATTTTTGCTTCC TAG 


P97 


JZ 


ATCAGTTTCCCT 


CI 


183 


TACMGAGTCCMGGGC^GAGGTAAGTGTCCTTT AT TTTGCTGGATGTAG 


C2 


187 




A 9 
AZ 




MG^.GTCCAGGTGGACAGGTAA GGAGTGGGAGT CAGGGAGTC 


BZ / 


/ / 
44 


ACACAAGA TCCAAGAGGAGAGGTAA GGAGT GAG AGGCAGGGAGTC 


CI 


238 


TTTAATATTACCT GAGGTAAGGTAA . GGC AAAGAGTGGG AGGCAGGGAGTC 


C2 


237 


C - G 


AZ 


yo 


CAGTTCCAGGGACAGAGATTACQGGATaAAAAGTGAAAGGAGAGGGACG GGGCCCAT 


DOT 

B27 


0 1 

91 


CAGTT CAGGGACAGGGATTa^O^GGAGMGTGAAGGGGAAGC GGG TGGGC 


CI 


288 


CAGTT CAG<jC^CGGG3ATTCCAGGAGAAG TGAAGGGGAAG GGGCTGGGCG 


C2 


288 


— 


A 9 




GOCGAG GGTTTCKCCTTGTTTCT CAGACAGCTC TTGGGOCA A GAC 


DZ/ 


141 


Ga^CTCGG03TCIXTTCCCT G3AC 


t-1 


3Jo 


CAGCC TGGG&GICICICCCT^ GCC AG3AC 


Cz 


337 


— GG 


A 9 
nZ 




TCAQGGAGACATTGAGACAGAGC GCTTGGCACAGMGCAGAGGGGTCAGGG 


B27 


191 


TCAG9CAGACAGTX7TGACAAAGAGGCT GGTGTAGGAGAAGAGGGATCAGG 


CI 


388 


TCAGGC^CACAGTGTCACAAAGATQOT 


C2 


387 


G 


A2 


246 


OGAA GTQZAGGGXOCAGGCGTTGGCICIX^ 


A3 


1 




Ax 


1 




A24 


1 




B27 


241 


AOGAAOGTCXAAQaXaXGXG CG3 TCTXIAGGGTCICAGGCTOCGAGAG 


CI 


438 


ACGAA GTCCCAQGTCCCGGGCG GGGTTCIT^GGGTCTC^GGC^ 


C2 


438 


-A 
A2 296 CGoTGTATGGATTCGGjAGTCOZAG^^ AGTT

A3 9 T A 

Ax 9 TG G C 

A24 11 - T 

B27 291 CCTTGTCTGCATTGGGGAGXG2AG\GTTGGGG TIOCOCACTCOCACGAGTT 

C1 488 CCGTGTCTQCACTGGGGAGXGCD3CGTO

C2 488 



A2 348 TUTTTTCTCCC TCTCCCAAOCTA TGTAGGGTCCTTC7TCCTGGAT ACTCAC 

A3 60 CTG C A G 

Ax 61 tC A GC AC C 

A24 61 7G-

B27 344 TCACTTCT TCTXCMCCTATGTCGGGTCCTrCTT(XAGGAT ACTCGT 

C1 538 G TTCACTTCITCTCCCMCCTXX^ ACTCAT

C2 538 T A 

C3 1 T G G 

A2 399 GACGCG3A(TCAGTTCTC\CTC^ AGAGAAG C " 

A3 114 

Ax 109 A A TCA -T 

A24 Hi G 

B27 392 GAGGCGTCCCCATTTC CACICCCATTOGGTCTCGGGT GTCTAGAGAAG C 

B58 1 

C1 588 GACGCGTCCCCMTTXCA^^ TCT AGAAG C

C2 589 - AG 

C3 36 -ACCNN G 

A2 449 C^ATCAGTGTCGTOXGG^XZGOG3TTC^AMGT CCGCACG 

A3 164 T C 

Ax 159 G C C C C 

A24 161 A ... T 

B27 442 CMTC^GTGTCGCCG3GGTCCC\GTTCTAAAGT CCCCACG 

B58 12 

C1 635 CMTCAQCGTCT033CAGTCQCQGTTCTAMGICCC CAGT

C2 637 C 

C3 87 GG G 

A2 489 CACCCACCQGGACTCAGA TTaXCCCAGAOGCOGAGGATGGC C 

A3 204 TCGTGGAGACCAGGC 

Ax 199 T G 

A24 201 

B27 482 CACCCACCCGGACTCAGA ATCTCETCAGACGCCGAG ATGCG G 



B58 52 

C1 675 CACCCACCCGGACTCAGA TTCTOZCCAGACGGCGAG ATGCG
C2 677 G 
C3 127 
1st EXON

A2 532 GTCATGGCGXCnir\ACCCTOGTCCTGCTAC 1CTCGGGGCCTC
A3 262 C C
Ax 242 C C G A C
A24 244 G C
B27 524 GTCACGGCGOCCCGMCGCnurra^
B58 94 G
C1 717 GTCATGXGX(£GMCCCT^
C2 719
C3 169 G

A2 574 TGGOCCTGACCCAGAOCTGGGCGG
A3 305
Ax 285 C
A24 287 A
B27 567 TGGCCCTGACOGAGACCTGGGZTG
B58 137 C
C1 760 TGGCCCIX3ACCGAG\COiGGGXT
C2 762 _
C3 212 G

IVS1

A2 599 GTGAGTGCGGGGTOGSG AGGGAAACG GCC TCTGT GGQGAGAAGCAACGGQCC G
A3 C AC C G T
Ax 309 A T C T-G — G NG G CG
A24 311 TOG C C G CG
B27 591 GTGAGTGCGGGGTCAGGCAG3GAAATG GCC TCTGT QSQGAGGAGOGAGGGGA 03
B58 161 G - C
C1 784 GTGAGTGCQGGGTTGGG AGGGAAACG GCC TCP GCQGAGAQGAACGAQGTGXOG
C2 786 G G
C3 236 T T G G

A2 652 arrax gggggogcaggacoogogmgo^
A3 G G C
Ax 357 C G T AG A
A24 367 A
B27 645 CAGGC CGGGGCGCAGGAOCGGGGGAGCC^
B58 215 T A
C1 838 OCOGGC AGG Q3ZA(£k<X:<J£^
C2 840 G G - AGC
C3 291 GGA G

A2 711 a^OXXTCGTCCCCAG
A3 442 G -C
Ax 417 TC CT
A24 426
B27 703 ocarnrroGxarAG
B58 273
C1 895 030CTXXTO]CCCCCAG
C2 898 T
C3 351
IVS3

A2 1515 GTA(XAQGGGCCAa3GQ QC^^ 

A3 1265 - 

Ax 1222 C ACA - 

A24 1228 G 

B27 1508 CTAOCAGGGGCAGTOjGGAGOCTTOCG^ 

B58 1082 

C1 1704 GTACCAGGGGCAGTGGGGAGCCTTOCCCATCTCCCGTAGATCTC^

C2 1705 T G 

C3 1155 - T G 

A2 1574 ACMGGAGGGGAG^CAATIGGGAGCMCACTAGMTATarCCrc 

A3 1303 * C C G A T T 

Ax 1280 A A A T 

A24 1287 C 

B27 1567 AOC^GMGAGGAGGAAMTOGGATCAGCGCT

B58 1141

C1 1763 ACGAGGAGGGGAQGAAMTGGGATCAGCGCTAGMTATCX^

C2 1764 

C3 1213 

A2 1627 CCTGAGGGAGAQGAATCCICCTCOT 

A3 1356 T . T T T - GA G 

Ax 1333 T T 

A24 1341 T 

B27 1620 GGAGAATGGCATGAG'rrriLCTGAGTrTC 

B58 1194 

C1 1816 GGAGMTGGGATGAGTTTTCCTGAGTTTC

C2 1817 

C3 1266 

A2 1678 CIXnt^GGT IXTGCO^IGCTCTCTC^ CACMTTMGGGATAAAATCTCTGAAGGA 

A3 1406 T G A A -G 

Ax 1372 G G G - 

A24 1392 C 

B27 1649 CICTGAGGQOXCCTCTraxriCT AQGACMrTMGGGATGACGTCTCTGAGGAA 

B58 1223 _ 

C1 1845 CTCTGAGGGCCCCI1CTGCICTCT AGGACMTTMGGC^TGMGTCCTTGAGGAA

C2 1846 

C3 1295 G A 

A2 1733 ATGACGGG MGACGATCCCTCGMTACIGATGAGTGGTrcGCTTTGACAC 

A3 1460 G T TG T G G 

Ax 1426 ATGAA GAG 

A24 1447 A C 

B27 1704 ATGGAGGGGMGACAGTCCCTAG^TACTGATCAGGGGTXOCrrTTGACCC 

B58 1278 

C1 1900 ATG3AGGGGMGACAGTCnCTGGMTACIGATCAGGOGT^

C2 1901 

C3 1351 A 
A2 1783 ACACAGGCAGCAGCdTCa: CGCG TGACnTTC CrCTCAOGQ-i'lOl ICT OGC

A3 1510 C GA G 

Ax 1477 T C 

A24 1497 C A

B27 1755 CTGCAGCAGCCTTGGGAAGCG TGACTTTTCCTCTCAG3CCTTGTTCACAGC

B58 1329 T T

C1 1951 CTTTCACCACTGCAGCAG CICTQGT^ CIOCAGGOCTTCTnOCTGC
C2 1952 

C3 1411 

A2 1837 TTC^CACTCAAIUICTGTG^ 

A3 1560 C 

Ax 1528 * C C 

A24 1547 c 
B27 1806 CTC^CACIOCTCTGTTTGGGGCrCTG 
B58 1380 

C1 2013 CTCACGTTCAATGTGTTTGMGGTTTGATTGCAGCTTTT^
C2 2014 

C3 1464 C 

A2 1891 TOu\CrCAGGTCAGGACD^ TTTCCACGGAATAG 

A3 1614 TC A 

Ax 1567 T 

A24 1600 A 

B27 1860 TCCIACTCAGATCAGGAGC AGMGIU3CIGTO (XAACTTTOCAATGAATAG 
B58 1434 

C1 2067 TCCACTCAGGTCAGGACC\GMGTC&
C2 2068 
C3 1518 

A2 1955 GAGAmTCCCAGGTGCCrcTGTCCAGGCr^ 
A3 1664 — 

Ax 1632 T T C T T • 

A24 1650 — A A T G 

B27 1925 GAGATTATCQZAQjTXXO GCGTOCAGSC^^ CTT CCCCA 
B58 1499 

C1 2132 GAGATTATCCCAGGTGOCnxnGTOC^^
C2 2133
C3 1583 
A2 2014 TCO^GGTGTCCTGTCi^TtarCAAGA TAGOCACATCjTGIXjCTOGAOGAGTGTCOZATG

A3 1721 G G C T 

Ax 1691 C T CA A G C T 

A24 1706 G CA T 

B27 1983 QCCC^GGTGTCCTGTOCATTCTC AG GCTQGTCACATCGGTGGTCOAGGG TGTCGCATG 

B58 1557 A 

C1 2191 OCO^GGTGTGCTGTCCATTCTC AGGATQG TCACATGGGCGCTGTTGGACTGTCGC AAG
C2 2192 A 
C3 1642 G 

A2 2073 AC^GATCGAAMTGCCTGMTGATCTGACTCT TCCTGACAG 2113 

A3 1780 GC TT C T 1820 

Ax 1750 *GC TT TT C T 1791 

A24 1765 G GCAAAA C T 1784 

B27 2042 AGAGATGCAMGa^CTGMTTTTCTGACTCrTCCCAT CAG 2083 

B58 1616 1656

C1 2250 AGAG^TACAMGTGTCTGMTTTTCTGACTCTrCCCGT CAG 2290

C2 2251 G 2292 

C3 1701 1741 
TABLE 2

DQA1 Seq

A3 1 GATCTCTGTGTAGAATGTCCTG TTCTGAGCCAGTCCTGA GAGGAAAGCAAGTATAATrAA
A1.2 1 G A
A4.1 1 C G A A C G

A3 61 TTTGTTATTAACTGATGAAAGAATTAAGTGAAAGATAAACCTTAGGAAGC AGAGGGAAGT
A1.2 61 CA T C C
A4.1 61 G T C A

A3 121 TAA TCTATGACTAAGAAAGTTAAGTACTCTGATAACTCATTCATTCCTTCT
A1.2 122 A CCTAA T C C A A
A4.1 122 A CCTAA C C A CA A

A3 172 TTTGTTCATTTACATT ATTTAATCACAAGTCTATGATGTGCCAGGCTCTCAGGAAATA
A1.2 178 A T C C A
A4.1 178 A G T CG A

A3 230 GTGAAAATTGG CACGCGATATTCTGCCCTTGTGTAGCACACACCGTAGTGGGAAAG
A1.2 236 A AT G TAG
A4.1 237 A C ATT G TTA

A3 286 AA GTGCACTTTTAACCGGACAACTATCAACACGAAGCGGGGAGGAAGCAGGGG
A1.2 293 A T C T A
A4.1 294 A C A C AT AT

A3 339 CTGGAAATGTCCACAGACTTTGCCAAA GACAAAGCCCATAATATCTGAAAGTCAG
A1.2 347 G AA TG T
A4.1 348 T GG TG G T

A3 394 TTTCTTC CATCATTTTGTGTATTAAGGTTCTTTATTCCCCTGTTCTCTG CCTTCCT
A1.2 403 G CT C T C
A4.1 403 CT TCAT G C CA

A3 450 GCTTGTCATCTTC ACTCATCAGCTGACCATGTTGCCTCTTACGGTGTAAACTTGTACCAG
A1.2 459 C GT '
A4.1 462 *C C T

A3 510 TCTTATGGTCCCTCTGGGCAGTACAG CCATGAATTTGATGGAGA CGAGGAGTTCTAT
A1.2 519 T C C C " T C G*
A4.1 522 C C C ; T C C

A3 567 GTGGACCTGGAGAGGAAGGAGACTGTCTGGCAGTTGCCTCTGTTCCGCAGATTTA
A1.2 576 C G G GA A A G
A4.1 579 G TGT G TC A ACA

A3 622 GAAGATTTGACCCGCAATTTGCACTGACAAACATCGCTGTGCTAAAACATAACTTGA
A1.2 631 G T GGG G G GC C
A4.1 634 C

A3 679 ACATCGTGATTAAACGCTCCAACTCT ACCGCTGCTACCAATGGTATG TGTCCACCATTCTG
A1.2 688 A A C
A4.1 688 GTC A A

A3 740 CCTTTCTTTAC TGATTTATCCCTTTATACCAAGTTTCATTATTTTCTTT
A1.2 749 C TTAA A GC CC G C
A4.1 749 CC C A

A3 789 CCAAGAGGTCCCCAGATC 806
A1.2 802 819
A4.1 798 815
TABLE 3

DQB1 Seq 

1 MGCTTGTCCItrTTTCCATGMTAMTGTCTCTATCTAOGA 

GG T T A 



51 GTAGG TGCTTTniAAGMAGAAGGGAGTGA ACC7CAACQGG AOTGGGA G 

TT TT 
C * AC C TTT TA C CA AC GTGA CA C 

AT AT C A 

101 GGTAMTCTAQGCATC<jGMGGMQGTAT^ 

C 

G 

151 J^CGCGTGTCAGMCGAGXI^^ 

G A G - A T G 

A A T CG A 

201 TCCGTTGMCTCTCAGATTTATCTG^ 

C G G C 

C A G T T 

251 GGAGCTTCATGAAAMTGGGATTTCATGCGAGMCGOZCTC^ CCCTCTA 
C G A 

CA G G T 

301 AGTQCAGAGGTQ CATGTAAMTCAQCCCGAC^ 

C AT 
CT C C 

351 CAGGCICAGGCAGGGACAGGGCITTOirK^ 

CG A CC 
C G CC C 

401 C AGATTCCAGMQXCGCAMGMGGCGGGCAC^ 

CG CACCGG G - N N N 

G C C G G G 

451 GGGAQGATCC(2AGGTCP33AGCGa2AG^ 
501 GTCGCG03GXGGTTCCACAGCTa^GG033GGTCAGGXGXGXTGOG

T G T 

G 

551 GGGXGGaXGXTOGGGCC TGACTGA02GGOCQGTGATIXXCCOCAGAG 

A OCA 

GGGGCGGGGCC 

601 GATTTCGTGTAQZAGTTTMGGOZATGTGCTACTTC^ 

A 



651 GCGCGTGCGTCTTGT AACHAGACACATCT AT AAOGGAGAGGAGTACGCGC 

G G AG A AT T 

G T A 

701 GCTTCGACAGCGACGTCQGQGTGTAC^ 

AT T T T 

T C 

751 CCTGTTGGCGAGTACTGGMCAGCCAGMG3MGTCCT 



GC CA AA 

GC 

801 GGCGGAGTTGGA CACQGTGTGCAGACACMCTACGAGGTGGGGTAGCGCG 



CG G CTACTA 

A C T A CT A 

851 GGATCCK£AGAGGAGAGGTGAGCITO AGCC 

G 

CCT CC GG -TTCGCC 

CCT CC G G GCCT 

901 TTGGGCGGGACCGCGAGTCTCTGTGGCGGGAGG^ ATGGGGGCGAGGTC 

A CA GCAATTC 

A G A GOG GCGAA C C 

951 TCTGAAATCTTGAQXCAGTTCATTCC^^ 

-C - C GG 

GC TT -CTGC-AA 
1001 GGQGGGTGGTGGGGGCAGGTGCATCGGAGGGGCGG
COST - C T A 

1051 CAGGGGGA(^(X^GAGTTGGa^G3CTGarrAGTGTaXira^Ga^ 
G T A T G - T 

1101 OajTCCGTCGGaZTOGTCCIC^ 

C ■ 

C C c - T 

1151 TATQCGTTTGCCTCCTG GTGCCT^ 

TA 

1201 CCC\GTGCOCACariCITCCCCT^^ 

ATT G C CG G 

1251 AC02AGCMGXCCACAGTCGaXATTaXOQCA GGAAGCTT 1292 

T CG 

G T CTA A AGC CATG AGTGGGAAGCTT 






TABLE 4

DPB1 Seq

DPB4.1 7546 GGGMGATTTOjGAAGAATCGTTAATAT

DPB4.1 7574 TGAGAGAGAGAGGGAG^MGAGGATTAGATGAGAGTGGCGQTrCCQCICATGTCCGCCC^

DPB4.1 7634 CTGCGCIXAGAGMTTACCTTTTCCAQ3
DPB9 GGAT G GCA TT
New GGAT G GCA TT
DPw3

DPB4.1 7694 CAGCGCTTCCTGG AGAGAT ACATCTACAACCGGGAGGAGTTCGCXXIOCTTCGACAGCGAC
DPB9 T
New X
DPw3

DPB4.1 7754 GTGGGGGAGTTGCGGGCGGTGACGGAGCTGGGGCGGOCTGCTGCGGAGTACTGG
DPB9 A A C
New A A C
DPw3

DPB4.1 7814 CAGMGGACATCCTGGAGGAGMGCGGGCAGTGCCG3ACAGGATGTOCL\GACACAAC7AC
DPB9 G G A
New C G A
DPw3 C G A

DPB4.1 7874 GAGCTGOXGOXCCATGACCCTGCAGCGCCGAGGTGAGT
DPB9 A A G G
New A A G G
DPw3 A A G G ' -

DPB4.1 7934 CCCAGGXAGCCCCGCCGXCCGTGCOCAG



Primers for HLA loci 

Exemplary HLA locus-specific primers are listed below. Each of the primers hybridizes with at least 
about 15 consecutive nucleotides of the designated region of the allele sequence. The designation of an 
exemplary preferred primer together with its sequence is also shown. For many of the primers, the 
sequence is not identical for all of the other alleles of the locus. For each of the following preferred primers, 
additional preferred primers have sequences which correspond to the sequences of the homologous region 
of other alleles of the locus or to their complements. 

In one embodiment, Class I loci are amplified by using an A, B or C locus-specific primer together with
a Class I locus-specific primer. The Class I primer preferably hybridizes with IVS III sequences (or their
complements) or, more preferably, with IVS I sequences (or their complements). The term "Class I-specific
primer", as used herein, means that the primer hybridizes with an allele sequence (or its complement) for at
least two different Class I loci and does not hybridize with Class II locus allele sequences under the
conditions used. Preferably, the Class I primer hybridizes with at least one allele of each of the A, B and C
loci. More preferably, the Class I primer hybridizes with a plurality of, most preferably all of, the Class I
allele loci or their complements. Exemplary Class I locus-specific primers are also listed below.



HLA Primers 



A locus-specific primers 
allelic location: nt 1735-1757 of A3 
designation: SGD009.AIVS3.R2NP 
sequence: CATGTGGCCATCTTGAGAATGGA 
allelic location: nt 1541-1564 of A2 
designation: SGD006.AIVS3.R1NP 
sequence: GCCCGGGAGATCTACAGGCGATCA 
allelic location: nt 1533-1553 of A2 
designation: A2.1 

sequence: CGCCTCCCTGATCGCCTGTAG 
allelic location: nt 1667-1685 of A2 
designation: A2.2 

sequence: CCAGAGAGTGACTCTGAGG 
allelic location: nt 1704-1717 of A2 
designation: A2.3 
sequence: CACAATTAAGGGAT 
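The A locus-specific primer entries above can be checked for internal consistency with a short sketch: the span of each allelic location should equal the primer length, and a primer written for the opposite strand can be compared through its reverse complement. The tuple layout and function names are hypothetical; the designations, locations and sequences are copied from the list above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def span_matches_length(first, last, seq):
    """True when the stated nt span is as long as the primer sequence."""
    return last - first + 1 == len(seq)


# (designation, first nt, last nt, sequence), as listed for the A locus
A_PRIMERS = [
    ("SGD009.AIVS3.R2NP", 1735, 1757, "CATGTGGCCATCTTGAGAATGGA"),
    ("SGD006.AIVS3.R1NP", 1541, 1564, "GCCCGGGAGATCTACAGGCGATCA"),
    ("A2.1", 1533, 1553, "CGCCTCCCTGATCGCCTGTAG"),
    ("A2.2", 1667, 1685, "CCAGAGAGTGACTCTGAGG"),
    ("A2.3", 1704, 1717, "CACAATTAAGGGAT"),
]

for name, first, last, seq in A_PRIMERS:
    print(name, len(seq), span_matches_length(first, last, seq))

# A2.1 (nt 1533-1553) and SGD006.AIVS3.R1NP (nt 1541-1564) overlap by 13 nt.
overlap = reverse_complement("CGCCTCCCTGATCGCCTGTAG")[:13]
print(overlap, "GCCCGGGAGATCTACAGGCGATCA".endswith(overlap))
```

For these five entries the spans and lengths agree, and the 13-nt overlap between A2.1 and SGD006.AIVS3.R1NP matches as a reverse complement, which suggests the two primers are written for opposite strands.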

B locus-specific primers 
allelic location: nt 1108-1131 of B17 
designation: SGD007.BIVS3.R1NP
sequence: TCCCCGGCGACCTATAGGAGATGG 
allelic location: nt 1582-1604 of B17 
designation: SGD010.BIVS3.R2NP 
sequence: CTAGGACCACCCATGTGACCAGC 
allelic location: nt 500-528 of B27 
designation: B2.1 

sequence: ATCTCCTCAGACGCCGAGATGCGTCAC 
allelic location: nt 545-566 of B27 
designation: B2.2 

sequence: CTCCTGCTGCTCTGGGGGGCAG 
allelic location: nt 1852-1876 of B27 
designation: B2.3 

sequence: ACTTTACCTCCACTCAGATCAGGAG 
allelic location: nt 1945-1976 of B27 
designation: B2.4 

sequence: CGTCCAGGCTGGTGTCTGGGTTCTGTGCCCCT 
allelic location: nt 2009-2031 of B27 
designation: B2.5 

sequence: CTGGTCACATGGGTGGTCCTAGG 
allelic location: nt 2054-2079 of B27 
designation: B2.6 

sequence: CGCCTGAATTTTCTGACTCTTCCCAT 

C locus-specific primers
allelic location: nt 1182-1204 of C3
designation: SGD008.CIVS3.R1NP
sequence: ATCCCGGGAGATCTACAGGAGATG
allelic location: nt 1665-1687 of C3
designation: SGD011.CIVS3.R2NP
sequence: AACAGCGCCCATGTGACCATCCT
allelic location: nt 499-525 of C1
designation: C2.1
sequence: CTGGGGAGGCGCCGCGTTGAGGATTCT
allelic location: nt 642-674 of C1
designation: C2.2
sequence: CGTCTCCGCAGTCCCGGTTCTAAAGTTCCCAGT
allelic location: nt 738-755 of C1
designation: C2.3
sequence: ATCCTCGTGCTCTCGGGA
allelic location: nt 1970-1987 of C1
designation: C2.4
sequence: TGTGGTCAGGCTGCTGAC
allelic location: nt 2032-2051 of C1
designation: C2.5
sequence: AAGGTTTGATTCCAGCTT
allelic location: nt 2180-2217 of C1
designation: C2.6
sequence: CCCCTTCCCCACCCCAGGTGTTCCTGTCCATTCTTCAGGA
allelic location: nt 2222-2245 of C1
designation: C2.7
sequence: CACATGGGCGCTGTTGGAGTGTCG

Class I loci-specific primers
allelic location: nt 599-620 of A2
designation: SGD005.IIVS1.LNP
sequence: GTGAGTGCGGGGTCGGGAGGGA
allelic location: nt 489-506 of A2
designation: 1.1
sequence: CACCCACCGGGACTCAGA
allelic location: nt 574-595 of A2
designation: 1.2
sequence: TGGCCCTGACCCAGACCTGGGC
allelic location: nt 691-711 of A2
designation: 1.3
sequence: GAGGGTCGGGCGGGTCTCAGC
allelic location: nt 1816-1831 of A2
designation: 1.4
sequence: CTCTCAGGCCTTGTTC
allelic location: nt 1980-1923 of A2
designation: 1.5
sequence: CAGAAGTCGCTGTTCC
DGA1 locus-specific primers 

allelic location: nt 23-41 of OQA3 

designation: SGD001.DQA1.LNP 
40 sequence: TTCTGAGCCAGTCCTGAGA 

allelic location: nt 45-64 of DQA3 

designation: DQA3 E1a 

sequence: TTGCCCTGACCACCGTGATG 

allelic location: nt 444-463 of DQA3 
45 designation: DQA3 Elb 

sequence: CTTCCTGCTTGTCATCTTCA 

allelic location: nt 536-553 of DQA3 

designation: DQA3 E1c 

sequence: CCATGAATTTGATGGAGA 
allelic location: nt 705-723 of DQA3

designation: DQA3 E1d

sequence: ACCGCTGCTACCAATGGTA 

allelic location: nt 789-806 of DQA3 

designation: SGD003.DQA1.RNP
sequence: CCAAGAGGTCCCCAGATC
DRA locus-specific primers 

allelic location: nt 49-68 of DRA HUMMHDRAM (1183 nt sequence, Accession No. K01171)

designation: DRA E1
sequence: TCATCATAGCTGTGCTGATG

allelic location: nt 98-118 of DRA HUMMHDRAM (1183 nt sequence, Accession No. K01171)
designation: DRA 5'E2 (5' indicates the primer is used as the 5' primer)
sequence: AGAACATGTGATCATCCAGGC
allelic location: nt 319-341 of DRA HUMMHDRAM (1183 nt sequence, Accession No. K01171)
designation: DRA 3'E2

sequence: CCAACTATACTCCGATCACCAAT 

DRB locus-specific primers 
allelic location: nt 79-101 of DRB HUMMHDRC (1153 nt sequence, Accession No. K01171) 
designation: DRB E1

sequence: TGACAGTGACACTGATGGTGCTG 

allelic location: nt 123-143 of DRB HUMMHDRC (1153 nt sequence, Accession No. K01171)
designation: DRB 5'E2
sequence: GGGGACACCCGACCACGTTTC
allelic location: nt 357-378 of DRB HUMMHDRC (1153 nt sequence, Accession No. K01171)
designation: DRB 3'E2
sequence: TGCAGACACAACTACGGGGTTG 

DQB1 locus-specific primers 
allelic location: nt 509-532 of DQB1 DQwIya
designation: DQB E1 

sequence: TGGCTGAGGGCAGAGACTCTCCC 

allelic location: nt 628-647 of DQB1 DQwIya 

designation: DQB 5'E2
sequence: TGCTACTTCACCAACGGGAC

allelic location: nt 816-834 of DQB1 DQw1 v a 

designation: DQB 3'E2 

sequence: GGTGTGCACACACAACTAC 

allelic location: nt 124-152 of DQB1 DQwIya 
designation: DQB 5'IVS1a

sequence: AGGTATTTTACCCAGGGACCAAGAGAT 

allelic location: nt 314-340 of DQB1 DQw1 v a

designation: DQB 5'IVS1b

sequence: ATGTAAAATCAGCCCGACTGCCTCTTC
allelic location: nt 1140-1166 of DQB1 DQwIva
designation: DQB 3'IVS2

sequence: GCCTCGTGCCTTATGCGTTTGCCTCCT 
DPB1 locus-specific primers 

allelic location: nt 6116-6136 of DPB1 4.1 
designation: DPB E1

sequence: TGAGGTTAATAAACTGGAGAA 

allelic location: nt 7604-7624 of DPB1 4.1 

designation: DPB 5'IVS1

sequence: GAGAGTGGCGCCTCCGCTCAT
allelic location: nt 7910-7929 of DPB1 4.1

designation: DPB 3'IVS2

sequence: GAGTGAGGGCTTTGGGCCGG 



Primer pairs for HLA analyses

It is well understood that for each primer pair, the 5' upstream primer hybridizes with the 5' end of the
sequence to be amplified and the 3' downstream primer hybridizes with the complement of the 3' end of
the sequence. The primers amplify a sequence between the regions of the DNA to which the primers bind
and its complementary sequence, including the regions to which the primers bind. Therefore, for each of
the primers described above, whether the primer binds to the HLA-encoding strand or its complement
depends on whether the primer functions as the 5' upstream primer or the 3' downstream primer for that
particular primer pair.

In one embodiment, a Class I locus-specific primer pair includes a Class I locus-specific primer and an
A, B or C locus-specific primer. Preferably, the Class I locus-specific primer is the 5' upstream primer and
hybridizes with a portion of the complement of IVS I. In that case, the locus-specific primer is preferably the
3' downstream primer and hybridizes with IVS III. The primer pairs amplify a sequence of about 1.0 to
about 1.5 Kb.

In another embodiment, the primer pair comprises two locus-specific primers that amplify a DNA
sequence that does not include the variable exon(s). In one example of that embodiment, the 3'
downstream primer and the 5' upstream primer are Class I locus-specific primers that hybridize with IVS III
and its complement, respectively. In that case, a sequence of about 0.5 Kb corresponding to the intron
sequence is amplified.

Preferably, locus-specific primers for the particular locus, rather than for the HLA class, are used for
each primer of the primer pair. Due to differences in the Class II gene sequences, locus-specific primers
which are specific for only one locus participate in amplifying the DRB, DQA1, DQB and DPB loci.
Therefore, for each of the preferred Class II locus primer pairs, each primer of the pair participates in
amplifying only the designated locus and no other Class II loci.
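
The strand relationship between a listed primer site and the oligonucleotide that anneals to it can be checked with a few lines of code. The following is a minimal sketch only; the helper name `reverse_complement` is an illustrative assumption and not part of the described method. It takes the site listed above for SGD003.DQA1.RNP (CCAAGAGGTCCCCAGATC) and derives the sequence a 3' downstream primer for that site would have, which can be compared with the 5' to 3' sequence given for SGD 003 in Example 3.

```python
# Illustrative helper (hypothetical name): convert a primer site written on the
# encoding strand into the oligonucleotide that anneals to it, read 5' to 3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Site listed for SGD003.DQA1.RNP in the DQA1 primer listing (nt 789-806).
site_on_encoding_strand = "CCAAGAGGTCCCCAGATC"

# A 3' downstream primer is the reverse complement of its site.
downstream_primer = reverse_complement(site_on_encoding_strand)
print(downstream_primer)  # GATCTGGGGACCTCTTGG
```

The printed sequence is identical to the SGD 003 sequence written 5' to 3' in Example 3, which is consistent with that primer serving as the 3' downstream primer of the DQA1 pair.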



Analytical methods 



In one embodiment, the amplified sequence includes sufficient intron sequences to encompass length
polymorphisms. The primer-defined length polymorphisms (PDLPs) are indicative of the HLA locus allele in
the sample. For some HLA loci, use of a single primer pair produces primer-defined length polymorphisms
that distinguish between some of the alleles of the locus. For other loci, two or more pairs of primers are
used in separate amplifications to distinguish the alleles. For other loci, the amplified DNA sequence is
cleaved with one or more restriction endonucleases to distinguish the alleles. The primer-defined length
polymorphisms are particularly useful in screening processes.

In another embodiment, the invention provides an improved method that uses PCR amplification of a
genomic HLA DNA sequence of one HLA locus. Following amplification, the amplified DNA sequence is
combined with at least one endonuclease to produce a digest. The endonuclease cleaves the amplified
DNA sequence to yield a set of fragments having distinctive fragment lengths. Usually the amplified
sequence is divided, and two or more endonuclease digests are produced. The digests can be used, either
separately or combined, to produce RFLP patterns that can distinguish between individuals. Additional
digests can be prepared to provide enhanced specificity to distinguish between even closely related
individuals with the same HLA type.

In a preferred embodiment, the presence of a particular allele can be verified by performing a two step
amplification procedure in which an amplified sequence produced by a first primer pair is amplified by a
second primer pair which binds to and defines a sequence within the first amplified sequence. The first
primer pair can be specific for one or more alleles of the HLA locus. The second primer pair is preferably
specific for one allele of the HLA locus, rather than a plurality of alleles. The presence of an amplified
sequence indicates the presence of the allele, which is confirmed by production of characteristic RFLP
patterns.

To analyze RFLP patterns, fragments in the digest are separated by size and then visualized. In the
case of typing for a particular HLA locus, the analysis is directed to detecting the two DNA allele sequences
that uniquely characterize that locus in each individual. Usually this is performed by comparing the sample
digest RFLP patterns to a pattern produced by a control sample of known HLA allele type. However, when
the method is used for paternity testing or forensics, the analysis need not involve identifying a particular
locus or loci but can be done by comparing single or multiple RFLP patterns of one individual with those of
another individual using the same restriction endonuclease and primers to determine similarities and
differences between the patterns.

The number of digests that need to be prepared for any particular analysis will depend on the desired
information and the particular sample to be analyzed. For example, one digest may be sufficient to
determine that an individual cannot be the person whose blood was found at a crime scene. In general, the
use of two to three digests for each of two to three HLA loci will be sufficient for matching applications
(forensics, paternity). For complete HLA haplotyping, e.g., for transplantation, additional loci may need to be
analyzed.

As described previously, combinations of primer pairs can be used in the amplification method to
amplify a particular HLA DNA locus irrespective of the allele present in the sample. In a preferred
embodiment, samples of HLA DNA are divided into aliquots containing similar amounts of DNA per aliquot
and are amplified with primer pairs (or combinations of primer pairs) to produce amplified DNA sequences
for additional HLA loci. Each amplification mixture contains only primer pairs for one HLA locus. The
amplified sequences are preferably processed concurrently, so that a number of digest RFLP fragment
patterns can be produced from one sample. In this way, the HLA type for a number of alleles can be
determined simultaneously.

Alternatively, preparation of a number of RFLP fragment patterns provides additional comparisons of 
patterns to distinguish samples for forensic and paternity analyses where analysis of one locus frequently 
fails to provide sufficient information for the determination when the sample DNA has the same allele as the 
DNA to which it is compared. 

The use of HLA types in paternity tests or transplantation testing and in disease diagnosis and
prognosis is described in Basic & Clinical Immunology, 3rd Ed. (1980) Lange Medical Publications, pp 187-190,
which is incorporated herein by reference in its entirety. HLA determinations fall into two general
categories. The first involves matching of DNA from an individual and a sample. This category involves
forensic determinations and paternity testing. For category 1 analysis, the particular HLA type is not as
important as whether the DNA from the individuals is related. The second category is tissue typing, such
as for use in transplantation. In this case, rejection of the donated blood or tissue will depend on whether
the recipient and the donor express the same or different antigens. This is in contrast to first category
analyses, where differences in the HLA DNA in either the introns or exons are determinative.

For forensic applications, the sample DNA of the suspected perpetrator of the crime and
DNA found at the crime scene are analyzed concurrently and compared to determine whether the DNA is
from the same individual. The determination preferably includes analysis of at least three digests of
amplified DNA of the DQA1 locus and preferably also of the A locus. More preferably, the determination
also includes analysis of at least three digests of amplified DNA of an additional locus, e.g. the DPB locus.
In this way, the probability of discriminating differences between the DNA samples is sufficient.

For paternity testing, the analysis involves comparison of DNA of the child, the mother and the putative
father to determine the probability that the child inherited the obligate haplotype DNA from the putative
father. That is, any DNA sequence in the child that is not present in the mother's DNA must be consistent
with being provided by the putative father. Analysis of two to three digests for the DQA1 locus and preferably also
for the A locus is usually sufficient. More preferably, the determination also includes analysis of digests of
an additional locus, e.g. the DPB locus.

For tissue typing determinations for transplantation matching, analysis of three loci (HLA A, B, and DR) 
is often sufficient. Preferably, the final analysis involves comparison of additional loci including DQ and DP. 

Production of RFLP fragment patterns

The following Table 5 of exemplary fragment pattern lengths demonstrates distinctive patterns. For
example, as shown in the table, BsrI cleaves A2, A3 and A9 allele amplified sequences defined by primers
SGD005.IIVS1.LNP and SGD009.AIVS3.R2NP into sets of fragments with the following numbers of
nucleotides: (740, 691), (809, 335, 283) and (619, 462, 256, 93), respectively. The fragment patterns clearly
indicate which of the three A alleles is present. Table 5 illustrates a number of exemplary
endonucleases that produce distinctive RFLP fragment patterns for exemplary A allele sequences.

Table 5 illustrates the set of RFLP fragments produced by use of the designated endonucleases for
analysis of three A locus alleles. For each endonuclease, the number of nucleotides of each of the
fragments in a set produced by the endonuclease is listed. The first portion of the table illustrates RFLP
fragment lengths using the primers designated SGD009.AIVS3.R2NP and SGD005.IIVS1.LNP which produce
the longer of the two exemplary sequences. The second portion of the table illustrates RFLP fragment
lengths using the primers designated SGD006.AIVS3.R1NP and SGD005.IIVS1.LNP which produce the
shorter of the sequences. The third portion of the table illustrates the lengths of fragments of a DQA1 locus-specific
amplified sequence defined by the primers designated SGD001.DQA1.LNP and
SGD003.DQA1.RNP.

As shown in the Table, each of the endonucleases produces a characteristic RFLP fragment pattern 
which can readily distinguish which of the three A alleles is present in a sample. 



TABLE 5
RFLP FRAGMENT PATTERNS

A - Long

BsrI     A2    740   691
         A3    809   335   283
         A9    619   462   256    93

Cfr10I   A2   1055   399   245
         A3    473   399   247
         A9    786   399

DraII    A2    698   251   138
         A3    369   315   251   247
         A9    596   427   251

FokI     A2    728   248   151
         A3    515   225   213   151
         A9   1004   151

GsuI     A2    868   547    36
         A3    904   523
         A9    638   419   373

HphI     A2   1040   239
         A3    419   375   218   163
         A9    643   419   373

MboII    A2   1011   165   143   132
         A3    893   194   143   115
         A9   1349

PpuMI    A2    698   295   251
         A3    369   364   251   242
         A9    676   503   251

PssI     A2    695   295   251
         A3    366   315   251   242
         A9    596   427   251

A - Short

BsrI     A2    691   254
         A3    345   335   283
         A9    619   256    93

Cfr10I   A2
         A3
         A9

DraII    A2    295   251   210   138
         A3    315   251   210
         A9    427   251   210

FokI     A2    293   248   151   143   129    51
         A3    225   213   151   143   129    51
         A9    539   151   146   129

GsuI     A2    868    61    36
         A3    904    59
         A9    414   373   '.73

HphI     A2    554   339
         A3    411   375   177
         A9    414   373   178

MboII    A2
         A3
         A9

PpuMI    A2    295   257   212    59
         A3    364   251   210    72    66
         A9    503   251   211

PssI     A2    295   251   219    72
         A3    315   251   207    72    65
         A9    427   251   208    72

Screening Analysis for Genetic Disease

Carriers of genetic diseases and those affected by the disease can be identified by use of the present
method. Depending on the disease, the screening analysis can be used to detect the presence of one or
more alleles associated with the disease or the presence of haplotypes associated with the disease.
Furthermore, by analyzing haplotypes, the method can detect genetic diseases that are not associated with
coding region variations but are found in regulatory or other untranslated regions of the genetic locus. The
screening method is exemplified below by analysis of cystic fibrosis (CF).

Cystic fibrosis is an autosomal recessive disease, requiring the presence of a mutant gene on each
chromosome. CF is the most common genetic disease in Caucasians, occurring once in 2,000 live births. It
is estimated that one in forty Caucasians is a carrier for the disease.

Recently a specific deletion of three adjacent basepairs in the open reading frame of the putative CF
gene leading to the loss of a phenylalanine residue at position 508 of the predicted 1480 amino acid
polypeptide was reported [Kerem et al, Science 245:1073-1080 (1989)]. Based on haplotype analysis, the
deletion may account for most CF mutations in Northern European populations (about 68%). A second
mutation is reportedly prevalent in some Southern European populations. Additional data indicate that
several other mutations may cause the disease.

Studies of haplotypes of parents of CF patients (who necessarily have one normal and one disease-associated
haplotype) indicated that there are at least 178 haplotypes associated with the CF locus. Of
those haplotypes, 90 are associated only with the disease; 78 are found only in normals; and 10 are
associated with both the disease and with normals (Kerem et al, supra). The disease apparently is caused
by several different mutations, some in very low frequency in the population. As demonstrated by the
haplotype information, there are more haplotypes associated with the locus than there are mutant alleles
responsible for the disease.

A genetic screening program (based on amplification of exon regions and analysis of the resultant
amplified DNA sequence with probes specific for each of the mutations or with enzymes producing RFLP
patterns characteristic of each mutation) may take years to develop. Such tests would depend on detection
and characterization of each of the mutations, or at least of mutations causing about 90 to 95% or more of
the cases of the disease. The alternative is to detect only 70 to 80% of the CF-associated genes. That
alternative is generally considered unacceptable and is the cause of much concern in the scientific
community.

The present method directly determines haplotypes associated with the locus and can detect
haplotypes among the 178 currently recognized haplotypes associated with the disease locus. Additional
haplotypes associated with the disease are readily determined through the rapid analysis of DNA of
numerous CF patients by the methods of this invention. Furthermore, any mutations which may be
associated with noncoding regulatory regions can also be detected by the method and will be identified by
the screening process.

Rather than attempting to determine and then detect each defect in a coding region that causes the
disease, the present method amplifies intron sequences associated with the locus to determine allelic and
sub-allelic patterns. In contrast to use of mutation-specific probes where only known sequence defects can
be detected, new PDLP and RFLP patterns produced by intron sequences indicate the presence of a
previously unrecognized haplotype.

The same analysis can be performed for phenylalanine hydroxylase locus mutations that cause
phenylketonuria and for beta-globin mutations that cause beta-thalassemia and sickle cell disease and for
other loci known to be associated with a genetic disease. Furthermore, neither the mutation site nor the
location for a disease gene is required to determine haplotypes associated with the disease. Amplified
intron sequences in the regions of closely flanking RFLP markers, such as are known for Huntington's
disease and many other inherited diseases, can provide sufficient information to screen for haplotypes
associated with the disease.

Muscular dystrophy (MD) is a sex-linked disease. The disease-associated gene comprises a 2.3 million
basepair sequence that encodes a 3,685 amino acid protein, dystrophin. A map of mutations for 128 of the
194 patients studied (34 with Becker's muscular dystrophy and 160 with Duchenne muscular dystrophy) identified
115 deletions and 13 duplications in the coding region sequence [Den Dunnen et al, Am. J. Hum. Genet.
45:835-847 (1989)]. Although the disease is associated with a large number of mutations that vary widely,
the mutations have a non-random distribution in the sequence and are localized to two major mutation hot
spots (Den Dunnen et al, supra). Further, a recombination hot spot within the gene sequence has been
identified [Grimm et al, Am. J. Hum. Genet. 45:368-372 (1989)].

For analysis of MD, haplotypes on each side of the recombination hot spot are preferably determined.
Primer pairs defining amplified DNA sequences are preferably located near, within about 1 to 10 Kbp of, the
hot spot on either side of the hot spot. In addition, due to the large size of the gene, primer pairs defining
amplified DNA sequences are preferably located near each end of the gene sequence and most preferably
also in an intermediate location on each side of the hot spot. In this way, haplotypes associated with the
disease can be identified.

Other diseases, particularly malignancies, have been shown to be the result of an inherited recessive
gene together with a somatic mutation of the normal gene. One malignancy that is due to such "loss of
heterozygosity" is retinoblastoma, a childhood cancer. The loss of the normal gene through mutation has
been demonstrated by detection of the presence of one mutation in all somatic cells (indicating germ cell
origin) and detection of a second mutation in some somatic cells [Scheffer et al, Am. J. Hum. Genet.
45:252-260 (1989)]. The disease can be detected by amplifying somatic cell genomic DNA sequences that
encompass sufficient intron sequence nucleotides. The amplified DNA sequences preferably encompass
intron sequences located near one or more of the markers described by Scheffer et al, supra. Preferably, an
amplified DNA sequence located near an intragenic marker and an amplified DNA sequence located near a
flanking marker are used.

An exemplary analysis for CF is described in detail in the examples. Analysis of genetic loci for other
monogenic and multigenic genetic diseases can be performed in a similar manner.

As the foregoing description indicates, the present method of analysis of intron sequences is generally
applicable to detection of any type of genetic trait. Other monogenic and multigenic traits can be readily
analyzed by the methods of the present invention. Furthermore, the analysis methods of the present
invention are applicable to all eukaryotic cells, and are preferably used on those of plants and animals.
Examples of analysis of BoLA (bovine MHC determinants) further demonstrate the general applicability of
the methods of this invention.

This invention is further illustrated by the following specific but non-limiting examples. Procedures that
are constructively reduced to practice are described in the present tense, and procedures that have been
carried out in the laboratory are set forth in the past tense.

EXAMPLE 1

Forensic Testing

DNA extracted from peripheral blood of the suspected perpetrator of a crime and DNA from blood
found at the crime scene are analyzed to determine whether the two samples of DNA are from the same
individual or from different individuals.

The extracted DNA from each sample is used to form two replicate aliquots per sample, each aliquot
having 1 ug of sample DNA. Each replicate is combined in a total volume of 100 ul with a primer pair (1 ug
of each primer), dNTPs (2.5 mM each) and 2.5 units of Taq polymerase in amplification buffer (50 mM KCl;
10 mM Tris-HCl, pH 8.0; 2.5 mM MgCl2; 100 ug/ml gelatin) to form four amplification reaction mixtures. The
first primer pair contains the primers designated SGD005.IIVS1.LNP and SGD009.AIVS3.R2NP (A locus-specific).
The second primer pair contains the primers designated SGD001.DQA1.LNP and
SGD003.DQA1.RNP (DQA locus-specific). Each primer is synthesized using an Applied Biosystems model
308A DNA synthesizer. The amplification reaction mixtures are designated SA (suspect's DNA, A locus-specific
primers), SD (suspect's DNA, DQA1 locus-specific primers), CA (crime scene DNA, A locus-specific
primers) and CD (crime scene DNA, DQA1 locus-specific primers).

Each amplification reaction mixture is heated to 94°C for 30 seconds. The primers are annealed to the
sample DNA by cooling the reaction mixtures to 65°C for each of the A locus-specific amplification
mixtures and to 55°C for each of the DQA1 locus-specific amplification mixtures and maintaining the
respective temperatures for one minute. The primer extension step is performed by heating each of the
amplification mixtures to 72°C for one minute. The denaturation, annealing and extension cycle is repeated
30 times for each amplification mixture.

Each amplification mixture is aliquoted to prepare three restriction endonuclease digestion mixtures per
amplification mixture. The A locus reaction mixtures are combined with the endonucleases BsrI, Cfr10I and
DraII. The DQA1 reaction mixtures are combined with AluI, CviJI and DdeI.

To produce each digestion mixture, each of three replicate aliquots of 10 ul of each amplification
mixture is combined with 5 units of the respective enzyme for 60 minutes at 37°C under conditions
recommended by the manufacturer of each endonuclease.

Following digestion, the three digestion mixtures for each of the samples (SA, SD, CA and CD) are
pooled and electrophoresed on a 6.5% polyacrylamide gel for 45 minutes at 100 volts. Following
electrophoresis, the gel is stained with ethidium bromide.
The samples contain fragments of the following lengths:
SA: 786, 619, 596, 462, 427, 399, 256, 251, 93, 80
CA: 809, 786, 619, 596, 473, 462, 427, 399, 369, 335, 315, 283, 256, 251, 247, 93, 80
SD: 388, 338, 332, 277, 219, 194, 122, 102, 89, 79, 64, 55
CD: 587, 449, 388, 338, 335, 332, 277, 271, 219, 194, 187, 122, 102, 99, 89, 88, 79, 65, 64, 55
The analysis demonstrates that the blood from the crime scene and from the suspected perpetrator are
not from the same individual. The blood from the crime scene and from the suspected perpetrator are,
respectively, A3, A9, DQA1 0501, DQA1 0301 and A9, A9, DQA1 0501, DQA1 0501.
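
The A locus assignment in this example can be reproduced from the Table 5 values. The following is a minimal sketch under stated assumptions: the names `TABLE5_LONG`, `predicted_bands` and `best_genotype` are illustrative, only the BsrI, Cfr10I and DraII rows of the "A - Long" portion of Table 5 are used, and the 93-nucleotide BsrI fragment for A9 is taken from the text preceding the table. A genotype is accepted when every band it predicts for the pooled digests is observed, and the genotype leaving the fewest observed bands unexplained is reported. The 80-nucleotide band present in both samples has no counterpart in the table rows used here, so the sketch simply leaves it unexplained.

```python
from itertools import combinations_with_replacement

# Fragment lengths (nt) for the longer A locus amplified sequence, from Table 5.
# The 93-nt BsrI fragment for A9 comes from the text preceding the table.
TABLE5_LONG = {
    "A2": {"BsrI": [740, 691], "Cfr10I": [1055, 399, 245], "DraII": [698, 251, 138]},
    "A3": {"BsrI": [809, 335, 283], "Cfr10I": [473, 399, 247], "DraII": [369, 315, 251, 247]},
    "A9": {"BsrI": [619, 462, 256, 93], "Cfr10I": [786, 399], "DraII": [596, 427, 251]},
}


def predicted_bands(genotype):
    """Union of fragment lengths expected when the three digests of both alleles are pooled."""
    return {n for allele in genotype for frags in TABLE5_LONG[allele].values() for n in frags}


def best_genotype(observed):
    """Return the genotype whose predicted bands are all observed and that
    leaves the fewest observed bands unexplained (None if no genotype fits)."""
    observed = set(observed)
    candidates = []
    for genotype in combinations_with_replacement(sorted(TABLE5_LONG), 2):
        expected = predicted_bands(genotype)
        if expected <= observed:
            candidates.append((len(observed - expected), genotype))
    return min(candidates)[1] if candidates else None


SA = [786, 619, 596, 462, 427, 399, 256, 251, 93, 80]
CA = [809, 786, 619, 596, 473, 462, 427, 399, 369, 335, 315, 283, 256, 251, 247, 93, 80]

print(best_genotype(SA))  # ('A9', 'A9')
print(best_genotype(CA))  # ('A3', 'A9')
# Bands seen at the crime scene but not in the suspect's sample:
print(sorted(set(CA) - set(SA), reverse=True))  # [809, 473, 369, 335, 315, 283, 247]
```

The bands unique to the crime scene sample are exactly the A3 fragments that are not shared with A9, which is why the two samples cannot come from the same individual.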



EXAMPLE 2

Paternity Testing

Chorionic villus tissue was obtained by transcervical biopsy from a 7-week old conceptus (fetus). Blood
samples were obtained by venepuncture from the mother (M), and from the alleged father (AF). DNA was
extracted from the chorionic villus biopsy, and from the blood samples. DNA was extracted from the sample
from M by use of nonionic detergent (Tween 20) and proteinase K. DNA was extracted from the sample
from AF by hypotonic lysis. More specifically, 100 ul of blood was diluted to 1.5 ml in PBS and centrifuged
to remove buffy coat. Following two hypotonic lysis treatments involving resuspension of buffy coat cells in
water, the pellets were washed until redness disappeared. Colorless pellets were resuspended in water and
boiled for 20 minutes. Five 10 mm chorionic villus fronds were received. One frond was immersed in 200 ul
water. NaOH was added to 0.05 M. The sample was boiled for 20 minutes and then neutralized with HCl.
No further purification was performed for any of the samples.

The extracted DNA was submitted to PCR for amplification of sequences associated with the HLA loci
DQA1 and DPB1. The primers used were: (1) as the 5' primer for the DQA1 locus, the primer designated
SGD001.DQA1.LNP (DQA 5'IVS1) (corresponding to nt 23-39 of the DQA1 0301 allele sequence) and as the
3' primer for the DQA1 locus, the primer designated SGD003.DQA1.RNP (DQA 3'IVS2, corresponding to nt
789-806 of the DQA1 0301 sequence); (2) as the DPB primers, the primers designated 5'IVS1 (nt 7604-7624)
and 3'IVS2 (nt 7910-7929). The amplification reaction mixtures were: 150 ng of each primer; 25 u of test DNA;
10 mM Tris HCl, pH 8.3; 50 mM KCl; 1.5 mM MgCl2; 0.01% (w/v) gelatin; 200 uM dNTPs; water to 100 ul
and 2.5 U Taq polymerase.

The amplification was performed by heating the amplification reaction mixture to 94°C for 10 minutes
prior to addition of Taq polymerase. For DQA1, the amplification was performed at 94°C for 30 seconds,
then 55°C for 30 seconds, then 72°C for 1 minute for 30 cycles, finishing with 72°C for 10 minutes. For
DPB, the amplification was performed at 96°C for 30 seconds, then 65°C for 30 seconds, finishing with
65°C for 10 minutes.

Amplification was shown to be technically satisfactory by test gel electrophoresis which demonstrated
the presence of double stranded DNA of the anticipated size in the amplification reaction mixture. The test
gel was 2% agarose in TBE (tris borate EDTA) buffer, loaded with 15 ul of the amplification reaction
mixture per lane and electrophoresed at 200 V for about 2 hours until the tracker dye migrated between 6 and
7 cm into the 10 cm gel.

The amplified DQA1 and DPB1 sequences were subjected to restriction endonuclease digestion using
DdeI and MboII (8 and 12 units, respectively, at 37°C for 3 hours) for DQA1, and RsaI and FokI (8 and 11
units, respectively, at 37°C overnight) for DPB1 in 0.5 to 2.0 ul of enzyme buffers recommended by the
supplier (Pharmacia), together with 16-18 ul of the amplified product. The digested DNA was separated by
fragment length on gel electrophoresis (3% NuSieve). The RFLP patterns were examined under ultraviolet
light after staining the gel with ethidium bromide.

Fragment pattern analysis was performed by allele assignment of the non-maternal alleles using expected
fragment sizes based on the sequences of known endonuclease restriction sites. The fragment pattern
analysis revealed the obligate paternal DQA1 allele to be DQA1 0102 and DPB to be DPw1. The fragment
patterns were consistent with AF being the biological father.

To calculate the probability of true paternity, HLA types were assigned. Maternal and AF DQA1 types
were consistent with those predicted from the HLA Class II gene types determined by serological testing
using lymphocytotoxic antisera.

Considering alleles of the two HLA loci as being in linkage equilibrium, the combined probability of
non-paternity was given by:

0.042 x 0.314 = 0.013

i.e. the probability of paternity is (1 - 0.013) or 98.7%.

The relative chance of paternity is thus 74:75, i.e. the chance that the AF is not the biological father is
approximately 1 in 75. The parties to the dispute chose to regard these results as confirming the paternity
of the fetus by the alleged father.
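
The arithmetic above can be reproduced in a few lines. This is a minimal sketch only; the variable names are illustrative, and the two per-locus values are simply the figures quoted in the text (the text does not state which value belongs to which locus, so none is assumed here).

```python
# Per-locus probabilities of non-paternity as quoted in the text for the two loci.
p_non_paternity_per_locus = [0.042, 0.314]

# Under linkage equilibrium the two loci are treated as independent,
# so the per-locus probabilities multiply.
combined = 1.0
for p in p_non_paternity_per_locus:
    combined *= p

probability_of_paternity = 1.0 - combined

print(f"combined probability of non-paternity: {combined:.3f}")
print(f"probability of paternity: {probability_of_paternity:.1%}")
print(f"about 1 chance in {1 / combined:.0f} of non-paternity")
```

Carried without intermediate rounding, the product corresponds to about 1 chance in 76 of non-paternity, in line with the approximately 1 in 75 stated in the text.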



EXAMPLE 3

Analysis of the HLA DQA1 Locus

The three haplotypes of the HLA DQA1 0102 locus were analyzed as described below. Those
haplotypes are DQA1 0102 DR15 Dw2; DQA1 0102 DR16 Dw21; and DQA1 0102 DR13 Dw19. The
distinction between the haplotypes is particularly difficult because there is a one basepair difference
between the 0102 alleles and the 0101 and 0103 alleles, which difference is not unique in DQA1 allele
sequences.

The procedure used for the amplification is the same as that described in Example 1, except that the
amplification used thirty cycles of 94°C for 30 seconds, 60°C for 30 seconds, and 72°C for 60 seconds.
The sequences of the primers were:
SGD 001 - 5' TTCTGAGCCAGTCCTGAGA 3'; and
SGD 003 - 5' GATCTGGGGACCTCTTGG 3'.
These primers hybridize to sequences about 500 bp upstream from the 5' end of the second exon and 50
bp downstream from the second exon and produce amplified DNA sequences in the 700 to 800 bp range.

Following amplification, the amplified DNA sequences were electrophoresed on a 4% polyacrylamide
gel to determine the PDLP type. In this case, amplified DNA sequences for 0102 comigrate with (are the
same length as) 0101 alleles and subsequent enzyme digestion is necessary to distinguish them.

The amplified DNA sequences were digested using the restriction enzyme AluI (Bethesda Research
Laboratories) which cleaves DNA at the sequence AGCT. The digestion was performed by mixing 5 units
(1 ul) of enzyme with 10 ul of the amplified DNA sequence (between about 0.5 and 1 ug of DNA) in the
enzyme buffer provided by the manufacturer according to the manufacturer's directions to form a digest.
The digest was then incubated for 2 hours at 37°C for complete enzymatic digestion.
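
The fragment lengths expected from such a digest can be predicted from a known sequence. The sketch below is an illustration only: the function name `digest` and the 40-nucleotide `toy_amplicon` are hypothetical and are not the DQA1 amplified sequence. It assumes a linear molecule and complete digestion, and uses the fact that AluI cuts within its AGCT site between G and C to leave blunt ends.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths, in 5' to 3' order, after cutting a linear
    sequence at every occurrence of `site`.

    cut_offset is the number of site bases left on the upstream fragment
    (2 for AluI, which cuts AG^CT).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    edges = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(edges, edges[1:])]


# Hypothetical 40-nt sequence containing two AGCT sites, for illustration only.
toy_amplicon = "TTGACCAGCTGGATCCGTTAACGGAGCTTTCCGGAATTCA"

fragments = digest(toy_amplicon, "AGCT", 2)
print(sorted(fragments, reverse=True))  # [18, 14, 8]
```

Sorting the lengths in descending order mirrors the order in which bands appear from the top of a gel.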

The products of the digestion reaction were mixed with approximately 0.1 µg of "ladder" nucleotide
sequences (nucleotide control sequences beginning at 123 bp in length and increasing in length by 123 bp
to a final size of about 5,000 bp; available commercially from Bethesda Research Laboratories, Bethesda
MD) and were electrophoresed using a 4% horizontal ultra-thin polyacrylamide gel (E-C Apparatus,
Clearwater FLA).
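The ladder described above can be enumerated with a short sketch. The 123 bp step comes from the text; the exact upper cutoff of 5,000 bp is an assumption, since the text says only "about 5,000 bp". The function names are illustrative. The second helper reports which two ladder rungs bracket a fragment of a given size, which is how the ladder is read when sizing a band.

```python
def ladder_sizes(step_bp=123, max_bp=5000):
    """Rung sizes of a ladder that starts at step_bp and rises in step_bp increments.

    max_bp is an assumed cutoff; the text gives the final size only approximately.
    """
    return list(range(step_bp, max_bp + 1, step_bp))


def bracketing_rungs(fragment_bp, rungs):
    """Return the nearest rung at or below, and at or above, fragment_bp."""
    lower = max((r for r in rungs if r <= fragment_bp), default=None)
    upper = min((r for r in rungs if r >= fragment_bp), default=None)
    return lower, upper


if __name__ == "__main__":
    rungs = ladder_sizes()
    print("First rungs:", rungs[:5])
    print("Number of rungs up to the assumed cutoff:", len(rungs))
    for fragment in (120, 350, 480):
        print(fragment, "bp lies between rungs", bracketing_rungs(fragment, rungs))
```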

The bands in the gel were visualized (stained) using a silver stain technique [Allen et al., BioTechniques
7:736-744 (1989)].

Three distinctive fragment patterns which correspond to the three haplotypes were produced using AluI.
The patterns (in base pair sized fragments) were:
1. DR15 DQ6 Dw2: 120, 350, 370, 480
2. DR13 DQ6 Dw19: 120, 330, 350, 480
3. DR16 DQ6 Dw21: 120, 330, 350
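The way such patterns are used to assign a haplotype can be sketched as a simple lookup. The fragment sizes below are the three AluI patterns listed above. The sizing tolerance of 5 bp and the function name are assumptions made for illustration, and the sketch assumes a sample carrying a single 0102 haplotype; a sample carrying two different haplotypes would show the combined bands and is not handled here.

```python
# AluI fragment patterns (bp) for the three DQA1 0102 haplotypes, as listed above.
ALUI_PATTERNS = {
    "DR15 DQ6 Dw2": (120, 350, 370, 480),
    "DR13 DQ6 Dw19": (120, 330, 350, 480),
    "DR16 DQ6 Dw21": (120, 330, 350),
}


def match_haplotypes(observed_bp, tolerance_bp=5):
    """Return the haplotypes whose pattern matches the observed band sizes.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    observed = sorted(observed_bp)
    hits = []
    for name, expected in ALUI_PATTERNS.items():
        same_count = len(expected) == len(observed)
        if same_count and all(
            abs(obs - exp) <= tolerance_bp for obs, exp in zip(observed, expected)
        ):
            hits.append(name)
    return hits


if __name__ == "__main__":
    print(match_haplotypes([480, 122, 352, 368]))
    print(match_haplotypes([120, 330, 350]))
    print(match_haplotypes([120, 480]))
```

The tolerance must stay well below 20 bp, because the patterns differ only by the 330/350 and 350/370 bp bands.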

The procedure was repeated using a 6.5% vertical polyacrylamide gel and ethidium bromide stain and
provided the same results. However, the fragment patterns were more readily distinguishable using the
ultrathin gels and silver stain.

This exemplifies analysis according to the method of this invention. Using the same procedure, 20 of
the other 32 DR/DQ haplotypes for DQA1 were identified using the same primer pair and two additional
enzymes (DdeI and MboII). PDLP groups and fragment patterns for each of the DQA1 haplotypes with the
three endonucleases are illustrated in Table 6.



TABLE 6

This example illustrates the ability of the method of this invention to distinguish the alleles and
haplotypes of a genetic locus. Specifically, the example shows that PDLP analysis stratifies five of the eight
alleles. These three restriction endonuclease digests distinguish each of the eight alleles and many of the
known haplotypes of the locus. The use of additional endonuclease digests for this amplified DNA
sequence can be expected to distinguish all of the known haplotypes and to potentially identify other
previously unrecognized haplotypes. Alternatively, use of the same or other endonuclease digests for
another amplified DNA sequence in this locus can be expected to distinguish the haplotypes.

In addition, analysis of amplified DNA sequences at the DRA locus in the telomeric direction and DQB
in the centromeric direction, preferably together with analysis of a central locus, can readily distinguish all of
the haplotypes for the region.

The same methods are readily applied to other loci. 



EXAMPLE 4 



Analysis of the HLA DQA1 Locus 



The DNA of an individual is analyzed to determine which of the three haplotypes of the HLA DQA1 
0102 locus are present. Genomic DNA is amplified as described in Example 3. Each of the amplified DNA 
sequences is sequenced to identify the haplotypes of the individual. The individual is shown to have the 
haplotypes DR15 DQ6 Dw2; DR13 DQ6 Dw19. 

The procedure is repeated as described in Example 3 through the production of the AluI digest. Each of
the digest fragments is sequenced. The individual is shown to have the haplotypes DR15 DQ6 Dw2; DR13
DQ6 Dw19.



EXAMPLE 5 



DQA1 Allele-Specific Amplification 

Primers were synthesized that specifically bind the 0102 and 0301 alleles of the DQA1 locus. The 5'
primer was the SGD 001 primer used in Example 3. The sequences of the 3' primers are listed below.
0102 5' TTGCTGAACTCAGGCCACC 3'
0301 5' TGCGGAACAGAGGCAACTG 3'

The amplification was performed as described in Example 3 using 30 cycles of a standard (94°C, 60°C,
72°C) PCR reaction. The template DNAs for each of the 0101, 0301 and 0501 alleles were amplified
separately. As determined by gel electrophoresis, the 0102-allele-specific primer amplified only template
0102 DNA and the 0301-allele-specific primer amplified only template 0301 DNA. Thus, each of the primers
was allele-specific.



EXAMPLE 6 



Detection of Cystic Fibrosis 



The procedure used for the amplification described in Example 3 is repeated. The sequences of the
primers are illustrated below. The first two primers are upstream primers, and the third is a downstream
primer. The primers amplify a DNA sequence that encompasses all of intervening sequence 1.
5' CAG AGG TCG CCT CTG GA 3';
5' AAG GCC AGC GTT GTC TCC A 3'; and
3' CCT CAA AAT TGG TCT GGT 5'.

These primers hybridize to the complement of sequences located from nt 136-152 and nt 154-172, and to
nt 187-207. [The nucleotide numbers are found in Riordan et al, Science 245:1066-1072 (1989).]
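The relation between a primer and its stated nucleotide coordinates can be checked with a short sketch: a primer that anneals over nt a to nt b inclusive is b - a + 1 nucleotides long. The sketch assumes that the three primers correspond, in the order printed, to the three coordinate ranges; the labels and function names are illustrative. As printed here, the two upstream primers agree with their ranges, while the downstream primer is 18 nucleotides against a 21 nucleotide range, a difference this sketch reports but does not resolve.

```python
# (label, primer as printed with spacing, (first nt, last nt)); pairing by print order
# is an assumption.
PRIMERS = [
    ("upstream 1", "CAG AGG TCG CCT CTG GA", (136, 152)),
    ("upstream 2", "AAG GCC AGC GTT GTC TCC A", (154, 172)),
    ("downstream", "CCT CAA AAT TGG TCT GGT", (187, 207)),
]


def span_length(first_nt, last_nt):
    """Number of nucleotides in an inclusive coordinate range."""
    return last_nt - first_nt + 1


def check_primers(primers):
    """Map each label to (primer length, length of its stated coordinate range)."""
    report = {}
    for label, spaced_seq, (first_nt, last_nt) in primers:
        primer_length = len(spaced_seq.replace(" ", ""))
        report[label] = (primer_length, span_length(first_nt, last_nt))
    return report


if __name__ == "__main__":
    for label, (primer_len, range_len) in check_primers(PRIMERS).items():
        status = "agrees" if primer_len == range_len else "differs"
        print(f"{label}: primer {primer_len} nt, range {range_len} nt ({status})")
```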

Following amplification, the amplified DNA sequences are electrophoresed on a 4% polyacrylamide gel
to determine the PDLP type. The amplified DNA sequences are separately digested using each of the
restriction enzymes AluI, MnlI and RsaI (Bethesda Research Laboratories). The digestion is performed as
described in Example 3. The products of the digestion reaction are electrophoresed and visualized using a
4% horizontal ultra-thin polyacrylamide gel and silver stain as described in Example 3.

Distinctive fragment patterns which correspond to disease-associated and normal haplotypes are
produced.



EXAMPLE 7 



Analysis of Bovine HLA Class I 



Bovine HLA Class I alleles and haplotypes are analyzed in the same manner as described in Example 
3. The primers are listed below. 

Bovine Primers (Class I HLA homolog) Tm
5' primer: 5' TOO TGG TCC TGA CCG AGA 3' (62°)
3' primers: 1) 3' A TGT GCC TTT GGA GGG TCT 5' (62°)
(for ~600 bp product)
2) 3' GCC AAC AT GAT CCG CAT 5' (62°)
(for ~900 bp product)

For the approximately 900 bp sequence, PDLP analysis is sufficient to distinguish alleles 1 and 3 (893
and 911 bp, respectively). Digests are prepared as described in Example 3 using AluI and DdeI. The
following patterns are produced from the 900 bp sequence.
Allele 1, AluI digest: 712, 181
Allele 3, AluI digest: 430, 300, 181
Allele 1, DdeI digest: 445, 201, 182, 28
Allele 3, DdeI digest: 406, 185, 182, 28, 16

The 600 bp sequence also produces distinguishable fragment patterns for those alleles. However, those
patterns are not as dramatically different as the patterns produced by the 900 bp sequence digests.
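A complete digest of a linear amplified sequence yields fragments whose lengths add up to the undigested length, so the listed patterns can be compared with the 893 and 911 bp PDLP lengths. The sketch below uses only the numbers given in this Example; the function name is illustrative. For the AluI digests the listed fragments account for the full length. For the DdeI digests they fall short by 37 and 94 bp; one possible reading, not stated in the text, is that some small fragments are not listed.

```python
# Undigested lengths (bp) of the approximately 900 bp amplified sequence.
AMPLICON_BP = {"Allele 1": 893, "Allele 3": 911}

# Fragment patterns (bp) as listed for each allele and enzyme.
DIGESTS = {
    ("Allele 1", "AluI"): [712, 181],
    ("Allele 3", "AluI"): [430, 300, 181],
    ("Allele 1", "DdeI"): [445, 201, 182, 28],
    ("Allele 3", "DdeI"): [406, 185, 182, 28, 16],
}


def unaccounted_bp(digests, amplicons):
    """For each digest, return amplicon length minus the sum of listed fragments."""
    return {
        (allele, enzyme): amplicons[allele] - sum(fragments)
        for (allele, enzyme), fragments in digests.items()
    }


if __name__ == "__main__":
    for (allele, enzyme), missing in unaccounted_bp(DIGESTS, AMPLICON_BP).items():
        print(f"{allele}, {enzyme}: {missing} bp not accounted for")
```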



EXAMPLE 8 



Preparation of Primers 



Each of the following primers is synthesized using an Applied Biosystems model 308A DNA synthesizer.



HLA locus primers 



A locus-specific primers 
SGD009.AIVS3.R2NP CATGTGGCCATCTTGAGAATGGA 
SGD006.AIVS3.R1NP GCCCGGGAGATCTACAGGCGATCA
A2.1 CGCCTCCCTGATCGCCTGTAG 
A2.2 CCAGAGAGTGACTCTGAGG 
A2.3 CACAATTAAGGGAT 

B locus-specific primers 
SGD007.BIVS3.R1NP TCCCCGGCGACCTATAGGAGATGG
SGD010.BIVS3.R2NP CTAGGACCACCCATGTGACCAGC 
B2.1 ATCTCCTCAGACGCCGAGATGCGTCAC
B2.2 CTCCTGCTGCTCTGGGGGGCAG
B2.3 ACTTTACCTCCACTCAGATCAGGAG
B2.4 CGTCCAGGCTGGTGTCTGGGTTCTGTGCCCCT
B2.5 CTGGTCACATGGGTGGTCCTAGG
B2.6 CGCCTGAATTTTCTGACTCTTCCCAT

C locus-specific primers 
SGD008.CIVS3.R1NP ATCCCGGGAGATCTACAGGAGATG
SGD011.CIVS3.R2NP AACAGCGCCCATGTGACCATCCT 
C2.1 CTGGGGAGGCGCCGCGTTGAGGATTCT 
C2.2 CGTCTCCGCAGTCCCGGTTCTAAAGTTCCCAGT 
C2.3 ATCCTCGTGCTCTCGGGA 
C2.4 TGTGGTCAGGCTGCTGAC 
C2.5 AAGGTTTGATTCCAGCTT
C2.6 CCCCTTCCCCACCCCAGGTGTTCCTGTCCATTCTTCAGGA
C2.7 CACATGGGCGCTGTTGGAGTGTCG

Class I loci-specific primers
SGD005.IVS1.LNP GTGAGTGCGGGGTCGGGAGGGA
1.1 CACCCACCGGGACTCAGA
1.2 TGGCCCTGACCCAGACCTGGGC
1.3 GAGGGTCGGGCGGGTCTCAGC
1.4 CTCTCAGGCCTTGTTC
1.5 CAGAAGTCGCTGTTCC

DQA1 locus-specific primers
SGD001.DQA1.LNP TTCTGAGCCAGTCCTGAGA
DQA3 E1a TTGCCCTGACCACCGTGATG
DQA3 E1b CTTCCTGCTTGTCATCTTCA
DQA3 E1c CCATGAATTTGATGGAGA
DQA3 E1d ACCGCTGCTACCAATGGTA
SGD003.DQA1.RNP CCAAGAGGTCCCCAGATC

DRA locus-specific primers
DRA E1 TCATCATAGCTGTGCTGATG
DRA 5'E2 AGAACATGTGATCATCCAGGC
DRA 3'E2 CCAACTATACTCCGATCACCAAT

DRB locus-specific primers
DRB E1 TGACAGTGACACTGATGGTGCTG
DRB 5'E2 GGGGACACCCGACCACGTTTC
DRB 3'E2 TGCAGACACAACTACGGGGTTG

DQB1 locus-specific primers
DQB E1 TGGCTGAGGGCAGAGACTCTCCC
DQB 5'E2 TGCTACTTCACCAACGGGAC
DQB 3'E2 GGTGTGCACACACAACTAC
DQB 5'IVS1a AGGTATTTTACCCAGGGACCAAGAGAT
DQB 5'IVS1b ATGTAAAATCAGCCCGACTGCCTCTTC
DQB 3'IVS2 GCCTCGTGCCTTATGCGTTTGCCTCCT

DPB1 locus-specific primers
DPB E1 TGAGGTTAATAAACTGGAGAA
DPB 5'IVS1 GAGAGTGGCGCCTCCGCTCAT
DPB 3'IVS2 GAGTGAGGGCTTTGGGCCGG
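The claims refer to a primer having a sequence corresponding to at least 15 consecutive nucleotides of a listed sequence. One literal reading of that condition, a shared run of 15 or more identical consecutive nucleotides, can be sketched as follows. The dictionary holds only the A locus-specific primers from the listing above; the function name, the candidate sequences and the literal reading itself are assumptions made for illustration and are not an interpretation of claim scope. Under this reading a listed sequence shorter than 15 nucleotides, such as A2.3 (14 nucleotides), cannot supply a qualifying run.

```python
# A locus-specific primers as listed above (5' to 3').
LISTED_PRIMERS = {
    "SGD009.AIVS3.R2NP": "CATGTGGCCATCTTGAGAATGGA",
    "SGD006.AIVS3.R1NP": "GCCCGGGAGATCTACAGGCGATCA",
    "A2.1": "CGCCTCCCTGATCGCCTGTAG",
    "A2.2": "CCAGAGAGTGACTCTGAGG",
    "A2.3": "CACAATTAAGGGAT",
}


def shared_run(candidate, listed, min_run=15):
    """Names of listed primers sharing at least min_run consecutive nt with candidate."""
    candidate = candidate.upper()
    hits = []
    for name, seq in listed.items():
        # Every window of min_run consecutive nucleotides in the listed primer.
        windows = {seq[i:i + min_run] for i in range(len(seq) - min_run + 1)}
        if any(window in candidate for window in windows):
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Hypothetical candidate: a 19 nt internal stretch of SGD009.AIVS3.R2NP.
    print(shared_run("GTGGCCATCTTGAGAATGG", LISTED_PRIMERS))
    # The full 14 nt A2.3 sequence cannot contain a 15 nt run.
    print(shared_run("CACAATTAAGGGAT", LISTED_PRIMERS))
```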



Claims 

1. A method for detection of at least one allele of a genetic locus comprising amplifying genomic DNA with 
an intron-spanning primer pair that defines a DNA sequence, said DNA sequence being in genetic linkage 
with said genetic locus and containing a sufficient number of intron sequence nucleotides to produce an 
amplified DNA sequence characteristic of said allele. 

2. The method of Claim 1 wherein said amplified DNA sequence includes at least about 300 nucleotides
corresponding to intron sequences.

3. The method of Claim 1 or Claim 2 wherein said intron sequence is adjacent to an exon encoding said
allele.

4. The method of any one of Claims 1 to 3 wherein said amplified DNA sequence is characteristic of at 
least one nonadjacent allele. 

5. The method of any one of Claims 1 to 3 wherein said amplified DNA sequence is characteristic of at
least one adjacent allele and at least one nonadjacent allele. 

6. The method of Claim 5 wherein said amplified DNA sequence includes at least about 1,000 nucleotides 
corresponding to intron sequences. 

7. A method for detection of at least one allele of a genetic locus comprising: 

a. amplifying genomic DNA with an intron-spanning primer pair that defines a DNA sequence, said DNA
sequence being in genetic linkage with said allele and containing a sufficient number of intron sequence 
nucleotides to produce an amplified DNA sequence characteristic of said allele; and 
b. analyzing said amplified DNA sequence to detect the presence of a genetic variation in said amplified 
sequence. 

8. The method of Claim 7 wherein said variation in said amplified DNA sequence is a variation in the length
of the primer-defined amplified DNA sequence. 

9. The method of Claim 7 or Claim 8 wherein said variation in said amplified DNA sequence is a change in 
the presence of at least one restriction site in the primer-defined amplified DNA sequence. 

10. The method of Claim 7 wherein said variation in said amplified DNA sequence is a change in the 
location of at least one restriction site in the primer-defined amplified DNA sequence.

11. The method of any one of Claims 7 to 10 wherein said variation in said amplified DNA sequence is a 
substitution of at least one nucleotide in the primer-defined amplified DNA sequence. 

12. The method of any one of Claims 7 to 11 wherein said genetic locus is a major histocompatibility locus.

13. The method of any one of Claims 7 to 12 wherein said allele is associated with a monogenic disease.

14. The method of Claim 13 wherein said monogenic disease is cystic fibrosis.

15. The method of any one of Claims 7 to 14 wherein at least about 70% of said primer-defined amplified 
DNA sequence corresponds to intron sequences. 

16. The method of any one of Claims 7 to 15 wherein said primer-defined amplified DNA sequence is from 
300 to 500 nucleotides in length. 

17. A method for producing RFLP fragments for an HLA locus of an individual comprising the steps of:

a. amplifying genomic HLA DNA from said individual with a primer pair specific for said HLA locus under 
conditions suitable to produce an amplified DNA sequence; and

b. producing a digest by combining said amplified DNA sequence with at least one endonuclease that 
cleaves said amplified DNA sequence to yield a set of fragments having distinctive fragment lengths. 

18. The method of Claim 17 additionally comprising the step of producing RFLP patterns from said digest.

19. The method of Claim 17 or Claim 18 wherein said primers define a DNA sequence that contains all
exons that encode allelic variability associated with said HLA locus. 

20. A method for producing RFLP fragments for an HLA locus of an individual comprising the steps of: 

a. amplifying genomic HLA DNA from said individual with a primer pair specific for said HLA locus under 
conditions suitable to produce an amplified DNA sequence, said primers defining a DNA sequence that
contains all exons that encode allelic variability associated with said HLA locus; and

b. producing a digest by combining said amplified DNA sequence with at least one endonuclease that 
cleaves said amplified DNA sequence to yield a set of fragments having distinctive fragment lengths. 

21. A method for producing RFLP patterns for an HLA locus of an individual comprising the steps of: 

a. amplifying HLA DNA from said individual with a primer pair specific for said HLA locus under
conditions suitable to produce an amplified DNA sequence, said primers being located in intervening 
sequence I and in intervening sequence III when said HLA locus is a Class I locus and in intervening 
sequence I and in intervening sequence II when said locus is a Class II locus;

b. producing a digest by combining said amplified DNA sequence with at least one endonuclease that 
cleaves said amplified DNA sequence to yield a set of fragments having distinctive fragment lengths; and

c. producing RFLP patterns from said digest. 

22. The method of Claim 21 wherein said amplification comprises: 

a. combining an HLA-locus specific primer pair with HLA DNA from said individual under hybridizing 
conditions for a period of time sufficient for each primer in said primer pair to produce an extension
product which, when separated from its complement, can serve as a template for synthesis of the
extension product of the other primer to produce a mixture;

b. treating said mixture under denaturing conditions to separate the primers from their extension
products;

c. treating said mixture with said HLA locus-specific primer pair such that a primer extension product is
synthesized using each of the templates produced in step (b) as a template, resulting in amplification of
the HLA DNA; and

d. repeating steps (b) and (c) to produce an amplified DNA sequence. 

23. The method of Claim 21 or Claim 22 wherein a second primer pair specific for said HLA locus is also
used to amplify said HLA DNA. 

24. The method of any one of Claims 21 to 23 wherein producing said RFLP fragment pattern comprises: 

a. combining said amplified DNA sequence with at least one endonuclease that cleaves said amplified 
DNA sequence to yield a set of fragments having distinctive fragment lengths; 
b. separating said fragments based on the length of the fragments to produce separated fragments; and
c. visualizing said separated fragments to produce RFLP fragment patterns. 

25. The method of Claim 24 wherein said fragments are separated using gel electrophoresis and visualized 
using a nucleotide-specific stain.

26. A method for determining whether DNA in a sample is from a particular individual comprising the steps
of:

a. amplifying DNA from said individual and DNA from said sample with a primer pair specific for an HLA 
locus under suitable conditions to produce an amplified DNA sequence from said individual and from 
said sample, said primers being located in intervening sequences I and III for an HLA Class I locus and 
in intervening sequences I and II for a Class II locus; 

b. combining said amplified DNA sequence from said individual and said amplified sample DNA from
said sample with at least one endonuclease that cleaves said amplified DNA sequence into a plurality of 
cleaved sequences of sufficiently different lengths to distinguish between alleles of said HLA locus for a 
period of time sufficient for digestion of said amplified DNA to produce a digest; and 
c. comparing restriction fragment length polymorphic patterns produced by said digest from said
individual and from said sample.

27. A method for determining whether an individual is the father of a child comprising the steps of: 

a. amplifying DNA from said individual, DNA from said child and DNA from said child's mother with a
pair of primers specific for an HLA locus under suitable conditions to produce amplified DNA sequences,
said primers being located in intervening sequences I and III for an HLA Class I locus and in intervening
sequences I and II for a Class II locus;

b. combining said amplified DNA sequence from said individual and said amplified sample DNA from 
said child with at least one endonuclease that cleaves said amplified DNA sequence into a plurality of
cleaved sequences of sufficiently different lengths to distinguish between alleles of said HLA locus to 
produce a digest; and 

c. comparing restriction fragment length polymorphic patterns produced by said digest from said
individual, from said child's mother and from said child. 

28. An HLA locus-specific primer selected from the group consisting of a Class I locus-specific primer, a
Class I A locus-specific primer, a Class I B locus-specific primer and a Class I C locus-specific primer.

29. The HLA locus-specific primer of Claim 28 wherein said primer has a sequence corresponding to at
least 15 consecutive nucleotides selected from the group consisting of

CATGTGGCCATCTTGAGAATGGA; 

GCCCGGGAGATCTACAGGCGATCA; CGCCTCCCTGATCGCCTGTAG; 

CCAGAGAGTGACTCTGAGG; CACAATTAAGGGAT; 

TCCCCGGCGACCTATAGGAGATGG; 
CTAGGACCACCCATGTGACCAGC;

ATCTCCTCAGACGCCGAGATGCGTCAC; 

CTCCTGCTGCTCTGGGGGGCAG; 

ACTTTACCTCCACTCAGATCAGGAG; 

CGTCCAGGCTGGTGTCTGGGTTCTGTGCCCCT; 
CTGGTCACATGGGTGGTCCTAGG;

CGCCTGAATTTTCTGACTCTTCCCAT; 

ATCCCGGGAGATCTACAGGAGATG;

AACAGCGCCCATGTGACCATCCT; 

CTGGGGAGGCGCCGCGTTGAGGATTCT; 
CGTCTCCGCAGTCCCGGTTCTAAAGTTCCCAGT;

ATCCTCGTGCTCTCGGGA; TGTGGTCAGGCTGCTGAC; 

AAGGTTTGATTCCAGCTT; 

CCCCTTCCCCACCCCAGGTGTTCCTGTCCATTCTTCAGGA;
CACATGGGCGCTGTTGGAGTGTCG;

GTGAGTGCGGGGTCGGGAGGGA; CACCCACCGGGACTCAGA; 

TGGCCCTGACCCAGACCTGGGC; GAGGGTCGGGCGGGTCTCAGC; 

CTCTCAGGCCTTGTTC; CAGAAGTCGCTGTTCC; 
TTCTGAGCCAGTCCTGAGA; TTGCCCTGACCACCGTGATG;

CTTCCTGCTTGTCATCTTCA; CCATGAATTTGATGGAGA; 

ACCGCTGCTACCAATGGTA; CCAAGAGGTCCCCAGATC; 

TCATCATAGCTGTGCTGATG; AGAACATGTGATCATCCAGGC; 

CCAACTATACTCCGATCACCAAT; 
TGACAGTGACACTGATGGTGCTG; GGGGACACCCGACCACGTTTC;

TGCAGACACAACTACGGGGTTG; 

TGGCTGAGGGCAGAGACTCTCCC; TGCTACTTCACCAACGGGAC; 
GGTGTGCACACACAACTAC;
AGGTATTTTACCCAGGGACCAAGAGAT;
ATGTAAAATCAGCCCGACTGCCTCTTC;
GCCTCGTGCCTTATGCGTTTGCCTCCT; 

TGAGGTTAATAAACTGGAGAA; GAGAGTGGCGCCTCCGCTCAT; 
and GAGTGAGGGCTTTGGGCCGG. 

30. An HLA Class I locus-specific primer pair.

31. An HLA Class II locus-specific, intron-spanning primer pair.

32. A DNA sequence defined by an HLA locus-specific primer pair. 

33. A kit comprising at least one HLA locus-specific primer pair in a suitable container, wherein said HLA 
locus-specific primer pair is selected from the group consisting of an HLA Class I locus-specific primer pair 
and an HLA Class II locus-specific, intron-spanning primer pair. 

34. The kit of Claim 33 additionally comprising at least one endonuclease that cleaves a DNA sequence
defined by said HLA locus-specific primer pair into a plurality of cleaved sequences of sufficiently different 
lengths to distinguish between alleles of said HLA locus. 

Claims for the following Contracting State: ES 

1. A method for detection of at least one allele of a genetic locus comprising amplifying genomic DNA with
an intron-spanning primer pair that defines a DNA sequence, said DNA sequence being in genetic linkage
with said genetic locus and containing a sufficient number of intron sequence nucleotides to produce an 
amplified DNA sequence characteristic of said allele. 

2. The method of Claim 1 wherein said amplified DNA sequence includes at least about 300 nucleotides
corresponding to intron sequences.

3. The method of Claim 1 or Claim 2 wherein said intron sequence is adjacent to an exon encoding said 
allele. 

4. The method of any one of Claims 1 to 3 wherein said amplified DNA sequence is characteristic of at
least one nonadjacent allele.

5. The method of any one of Claims 1 to 3 wherein said amplified DNA sequence is characteristic of at 
least one adjacent allele and at least one nonadjacent allele. 

6. The method of Claim 5 wherein said amplified DNA sequence includes at least about 1,000 nucleotides 
corresponding to intron sequences. 

7. A method for detection of at least one allele of a genetic locus comprising:

a. amplifying genomic DNA with an intron-spanning primer pair that defines a DNA sequence, said DNA 
sequence being in genetic linkage with said allele and containing a sufficient number of intron sequence 
nucleotides to produce an amplified DNA sequence characteristic of said allele; and 

b. analyzing said amplified DNA sequence to detect the presence of a genetic variation in said amplified 
sequence.

8. The method of Claim 7 wherein said variation in said amplified DNA sequence is a variation in the length 
of the primer-defined amplified DNA sequence. 

9. The method of Claim 7 or Claim 8 wherein said variation in said amplified DNA sequence is a change in 
the presence of at least one restriction site in the primer-defined amplified DNA sequence. 

10. The method of Claim 7 wherein said variation in said amplified DNA sequence is a change in the
location of at least one restriction site in the primer-defined amplified DNA sequence. 

11. The method of any one of Claims 7 to 10 wherein said variation in said amplified DNA sequence is a
substitution of at least one nucleotide in the primer-defined amplified DNA sequence.

12. The method of any one of Claims 7 to 11 wherein said genetic locus is a major histocompatibility locus.

13. The method of any one of Claims 7 to 12 wherein said allele is associated with a monogenic disease. 

14. The method of Claim 13 wherein said monogenic disease is cystic fibrosis.

15. The method of any one of Claims 7 to 14 wherein at least about 70% of said primer-defined amplified
DNA sequence corresponds to intron sequences.

16. The method of any one of Claims 7 to 15 wherein said primer-defined amplified DNA sequence is from 
300 to 500 nucleotides in length. 

17. A method for producing RFLP fragments for an HLA locus of an individual comprising the steps of: 

a. amplifying genomic HLA DNA from said individual with a primer pair specific for said HLA locus under
conditions suitable to produce an amplified DNA sequence; and

b. producing a digest by combining said amplified DNA sequence with at least one endonuclease that 
cleaves said amplified DNA sequence to yield a set of fragments having distinctive fragment lengths. 

18. The method of Claim 17 additionally comprising the step of producing RFLP patterns from said digest. 

19. The method of Claim 17 or Claim 18 wherein said primers define a DNA sequence that contains all
exons that encode allelic variability associated with said HLA locus.

20. A method for producing RFLP fragments for an HLA locus of an individual comprising the steps of: 

a. amplifying genomic HLA DNA from said individual with a primer pair specific for said HLA locus under 
conditions suitable to produce an amplified DNA sequence, said primers defining a DNA sequence that 
contains all exons that encode allelic variability associated with said HLA locus; and

b. producing a digest by combining said amplified DNA sequence with at least one endonuclease that
cleaves said amplified DNA sequence to yield a set of fragments having distinctive fragment lengths. 

21. A method for producing RFLP patterns for an HLA locus of an individual comprising the steps of: 

a. amplifying HLA DNA from said individual with a primer pair specific for said HLA locus under 
conditions suitable to produce an amplified DNA sequence, said primers being located in intervening
sequence I and in intervening sequence III when said HLA locus is a Class I locus and in intervening
sequence I and in intervening sequence II when said locus is a Class II locus; 

b. producing a digest by combining said amplified DNA sequence with at least one endonuclease that 
cleaves said amplified DNA sequence to yield a set of fragments having distinctive fragment lengths; and 

c. producing RFLP patterns from said digest. 

22. The method of Claim 21 wherein said amplification comprises:

a. combining an HLA-locus specific primer pair with HLA DNA from said individual under hybridizing 
conditions for a period of time sufficient for each primer in said primer pair to produce an extension
product which, when separated from its complement, can serve as a template for synthesis of the 
extension product of the other primer to produce a mixture; 

b. treating said mixture under denaturing conditions to separate the primers from their extension
products; 

c. treating said mixture with said HLA locus-specific primer pair such that a primer extension product is
synthesized using each of the templates produced in step (b) as a template, resulting in amplification of 
the HLA DNA; and 

d. repeating steps (b) and (c) to produce an amplified DNA sequence.

23. The method of Claim 21 or Claim 22 wherein a second primer pair specific for said HLA locus is also 
used to amplify said HLA DNA. 

24. The method of any one of Claims 21 to 23 wherein producing said RFLP fragment pattern comprises: 

a. combining said amplified DNA sequence with at least one endonuclease that cleaves said amplified
DNA sequence to yield a set of fragments having distinctive fragment lengths;

b. separating said fragments based on the length of the fragments to produce separated fragments; and 

c. visualizing said separated fragments to produce RFLP fragment patterns. 

25. The method of Claim 24 wherein said fragments are separated using gel electrophoresis and visualized 
using a nucleotide-specific stain. 

26. A method for determining whether DNA in a sample is from a particular individual comprising the steps
of: 

a. amplifying DNA from said individual and DNA from said sample with a primer pair specific for an HLA 
locus under suitable conditions to produce an amplified DNA sequence from said individual and from 
said sample, said primers being located in intervening sequences I and III for an HLA Class I locus and
in intervening sequences I and II for a Class II locus;

b. combining said amplified DNA sequence from said individual and said amplified sample DNA from 
said sample with at least one endonuclease that cleaves said amplified DNA sequence into a plurality of 
cleaved sequences of sufficiently different lengths to distinguish between alleles of said HLA locus for a
period of time sufficient for digestion of said amplified DNA to produce a digest; and

c. comparing restriction fragment length polymorphic patterns produced by said digest from said
individual and from said sample.

27. A method for determining whether an individual is the father of a child comprising the steps of: 

a. amplifying DNA from said individual, DNA from said child and DNA from said child's mother with a 
pair of primers specific for an HLA locus under suitable conditions to produce amplified DNA sequences, 
said primers being located in intervening sequences I and III for an HLA Class I locus and in intervening 
sequences I and II for a Class II locus; 

b. combining said amplified DNA sequence from said individual and said amplified sample DNA from 
said child with at least one endonuclease that cleaves said amplified DNA sequence into a plurality of 
cleaved sequences of sufficiently different lengths to distinguish between alleles of said HLA locus to 
produce a digest; and 

c. comparing restriction fragment length polymorphic patterns produced by said digest from said 
individual, from said child's mother and from said child. 

28. A process which comprises the preparation of an HLA locus-specific primer selected from the group 
consisting of a Class I A locus-specific primer, a Class I B locus-specific primer and a Class I C locus- 
specific primer. 

29. A process according to Claim 28 wherein said primer has a sequence corresponding to at least 15 
consecutive nucleotides selected from the group consisting of 
CATGTGGCCATCTTGAGAATGGA; 

GCCCGGGAGATCTACAGGCGATCA; CGCCTCCCTGATCGCCTGTAG; 

CCAGAGAGTGACTCTGAGG; CACAATTAAGGGAT; 

TCCCCGGCGACCTATAGGAGATGG; 

CTAGGACCACCCATGTGACCAGC; 

ATCTCCTCAGACGCCGAGATGCGTCAC; 

CTCCTGCTGCTCTGGGGGGCAG; 

ACTTTACCTCCACTCAGATCAGGAG; 

CGTCCAGGCTGGTGTCTGGGTTCTGTGCCCCT; 

CTGGTCACATGGGTGGTCCTAGG; 

CGCCTGAATTTTCTGACTCTTCCCAT; 

ATCCCGGGAGATCTACAGGAGATG; 

AACAGCGCCCATGTGACCATCCT; 

CTGGGGAGGCGCCGCGTTGAGGATTCT; 

CGTCTCCGCAGTCCCGGTTCTAAAGTTCCCAGT; 

ATCCTCGTGCTCTCGGGA; TGTGGTCAGGCTGCTGAC; 

AAGGTTTGATTCCAGCTT; 

CCCCTTCCCCACCCCAGGTGTTCCTGTCCATTCTTCAGGA;
CACATGGGCGCTGTTGGAGTGTCG; 

GTGAGTGCGGGGTCGGGAGGGA; CACCCACCGGGACTCAGA; 
TGGCCCTGACCCAGACCTGGGC; GAGGGTCGGGCGGGTCTCAGC; 
CTCTCAGGCCTTGTTC; CAGAAGTCGCTGTTCC; 
TTCTGAGCCAGTCCTGAGA; TTGCCCTGACCACCGTGATG; 
CTTCCTGCTTGTCATCTTCA; CCATGAATTTGATGGAGA; 
ACCGCTGCTACCAATGGTA; CCAAGAGGTCCCCAGATC; 
TCATCATAGCTGTGCTGATG; AGAACATGTGATCATCCAGGC; 
CCAACTATACTCCGATCACCAAT; 

TGACAGTGACACTGATGGTGCTG; GGGGACACCCGACCACGTTTC; 
TGCAGACACAACTACGGGGTTG; 

TGGCTGAGGGCAGAGACTCTCCC; TGCTACTTCACCAACGGGAC; 

GGTGTGCACACACAACTAC; 

AGGTATTTTACCCAGGGACCAAGAGAT; 

ATGTAAAATCAGCCCGACTGCCTCTTC; 

GCCTCGTGCCTTATGCGTTTGCCTCCT; 

TGAGGTTAATAAACTGGAGAA; GAGAGTGGCGCCTCCGCTCAT; 
and GAGTGAGGGCTTTGGGCCGG. 

30. A process which comprises the preparation of an HLA Class I locus-specific primer pair.

31. A process which comprises the preparation of an HLA Class II locus-specific, intron-spanning primer
pair.

32. A process which comprises the preparation of a DNA sequence defined by an HLA locus-specific
primer pair.

33. A kit comprising at least one HLA locus-specific primer pair in a suitable container, wherein said HLA 
locus-specific primer pair is selected from the group consisting of an HLA Class I locus-specific primer pair 
and an HLA Class II locus-specific, intron-spanning primer pair. 

34. The kit of Claim 33 additionally comprising at least one endonuclease that cleaves a DNA sequence 
defined by said HLA locus-specific primer pair into a plurality of cleaved sequences of sufficiently different 
lengths to distinguish between alleles of said HLA locus. 



